Glossary term
Latency (Hardware)
The time delay in moving data between components; lower latency means higher cluster utilization at scale.
Latency in hardware is the time delay in moving data between components: memory to processor, GPU to GPU, node to node. At cluster scale, latency determines how efficiently thousands of GPUs can stay synchronized, because GPUs sit idle while they wait for data to arrive. Lower latency therefore tends to mean higher utilization and faster training. It is the companion constraint to bandwidth in AI networking. See Networking & Optical 101.
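One common way to see how latency and bandwidth act as companion constraints is the simple cost model: transfer time = latency + message size / bandwidth. The sketch below is illustrative only. The function names are hypothetical, and the link figures (5 microseconds, 50 GB/s) are assumed values chosen for the example, not measurements of any real interconnect.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Simple cost model: fixed latency plus size divided by bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def link_efficiency(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Fraction of the total transfer time spent actually moving bytes."""
    ideal = message_bytes / bandwidth_bytes_per_s
    return ideal / transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s)


if __name__ == "__main__":
    # Assumed, illustrative link parameters (not measured values).
    latency_s = 5e-6          # 5 microseconds per message
    bandwidth = 50e9          # 50 GB/s

    for size in (4 * 1024, 1024 ** 2, 1024 ** 3):  # 4 KiB, 1 MiB, 1 GiB
        t = transfer_time(size, latency_s, bandwidth)
        eff = link_efficiency(size, latency_s, bandwidth)
        print(f"{size:>12} bytes: {t * 1e6:12.2f} us, efficiency {eff:.4f}")
```

Under these assumptions, small messages are dominated by the fixed latency term and large messages by bandwidth, which is why frequent small synchronization exchanges are the case where latency matters most.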