
Glossary term

Throughput

The amount of data or work processed per unit time; what bandwidth ultimately enables for AI workloads.

Throughput is the amount of data or work processed per unit time: tokens per second for inference, samples per second for training. It is the practical performance metric that bandwidth, compute, and latency jointly determine. Maximizing throughput per dollar is the core optimization target of inference economics, because the more work a fixed hardware budget completes per second, the lower the marginal cost of serving AI. See Inference Economics 101.
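As a rough illustration of the throughput-per-dollar idea, the sketch below computes throughput from a measured run and converts it into a cost per million tokens. The function names, the token counts, and the $4.00 hourly hardware price are hypothetical values chosen for the arithmetic, not figures from any real deployment.

```python
def throughput(units_processed: float, elapsed_seconds: float) -> float:
    """Return units (tokens, samples, ...) processed per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return units_processed / elapsed_seconds


def cost_per_million_tokens(tokens_per_second: float, hourly_cost_usd: float) -> float:
    """Return the hardware cost in USD of generating one million tokens.

    Assumes the hardware is billed at a flat hourly rate and runs at the
    given throughput for the whole hour (a simplifying assumption).
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical measurement: 120,000 tokens generated in 60 seconds.
tps = throughput(120_000, 60)

# Hypothetical price: the serving hardware costs $4.00 per hour.
baseline_cost = cost_per_million_tokens(tps, 4.00)

# Doubling throughput on the same hardware halves the cost per token.
doubled_cost = cost_per_million_tokens(2 * tps, 4.00)

print(f"{tps:.0f} tokens/s")
print(f"${baseline_cost:.4f} per million tokens at baseline")
print(f"${doubled_cost:.4f} per million tokens at 2x throughput")
```

Under these assumptions, cost per token is inversely proportional to throughput, which is why throughput per dollar is treated as the quantity to maximize.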
