Closelook

Inference Economics: How the Training-to-Inference Shift Changes Everything

The AI industry is shifting from a training-dominated phase (where the biggest spend goes to training new models) to an inference-dominated phase (where the biggest spend goes to running models for users at scale). This shift fundamentally changes the semiconductor demand profile: training favors massive GPUs with maximum compute and memory bandwidth, while inference favors efficiency, throughput per watt, and cost per query. The companies that win the training phase (NVIDIA) may not be the same ones that dominate inference. Custom ASICs (Google TPU, AWS Inferentia), inference-optimized architectures (Groq, Cerebras), and edge inference chips (Qualcomm, MediaTek) all gain relevance.

Training vs. Inference: Different Economics

Training is a fixed cost: you train a model once (or periodically retrain) using massive GPU clusters. The economics favor raw compute power — whoever has the most FLOPs wins. This is NVIDIA's domain: A100, H100, B200 are all optimized for training throughput.

Inference is a variable cost: every user query, every API call, every agent action requires inference compute. As AI adoption scales, inference volume compounds with every new user and agent, while training spend stays comparatively flat. The economics shift from "maximum compute" to "minimum cost per query" and "maximum throughput per watt."

This distinction matters enormously for investors. In a training-dominated world, you buy NVIDIA and NVIDIA's supply chain. In an inference-dominated world, the competitive landscape fragments and the value chain shifts.
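The fixed-versus-variable dynamic above can be sketched with a toy calculation. All numbers here are hypothetical, chosen only to illustrate how cumulative inference spend overtakes a one-time training cost; they do not reflect any real model's economics.

```python
import math

# Illustrative fixed-vs-variable cost model. Every figure is a made-up
# assumption, used only to show the crossover dynamic.
TRAINING_COST = 100_000_000    # one-time training spend, USD (hypothetical)
COST_PER_QUERY = 0.002         # marginal inference cost per query, USD (hypothetical)
QUERIES_PER_DAY = 200_000_000  # assumed steady-state query volume

def cumulative_inference_cost(days: int) -> float:
    """Total inference spend after `days` of operation."""
    return days * QUERIES_PER_DAY * COST_PER_QUERY

def crossover_day() -> int:
    """First day on which cumulative inference spend reaches the training cost."""
    daily_spend = QUERIES_PER_DAY * COST_PER_QUERY
    return math.ceil(TRAINING_COST / daily_spend)

print(crossover_day())  # with these assumptions: 250 days
```

Under these assumptions the one-time training cost is matched by inference spend in well under a year — which is the whole argument for why cost per query, not peak FLOPs, becomes the metric that matters at scale.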

Who Benefits from the Inference Shift

Custom ASICs gain share because hyperscalers (Google, Amazon, Meta) can design chips optimized for their specific inference workloads at a lower cost per query than general-purpose GPUs. Google's TPU v5e, AWS's Inferentia2 (Trainium2 is its training-focused counterpart), and Meta's MTIA are all built with inference economics in mind.

NVIDIA adapts by releasing inference-optimized configurations (L40S, H100 NVL) and pushing software moats (TensorRT-LLM, Triton). NVIDIA won't lose inference entirely — but its market share will be lower than in training.

Edge inference becomes relevant as models shrink enough to run on-device. Qualcomm, MediaTek, and Apple's custom silicon benefit from running AI locally rather than in the cloud.

What This Means for the Functional Index

Closelook tracks the training-to-inference ratio through the Compute layer of the Functional Index. As inference dominates, the weight of custom ASIC and inference-focused companies increases relative to pure GPU plays. The index adapts to reflect this structural shift.
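One way to picture this adaptation is a tilt that blends baseline weights with each name's inference exposure. The weighting scheme, the `tilt` parameter, and the inference-share figures below are all invented for illustration — this is a minimal sketch, not Closelook's actual index methodology.

```python
# Hypothetical weight-tilt sketch. The function name, the tilt factor, and
# all input numbers are assumptions made for illustration only.

def tilt_weights(base_weights: dict[str, float],
                 inference_share: dict[str, float],
                 tilt: float = 0.5) -> dict[str, float]:
    """Scale each base weight by (1 + tilt * inference share), then renormalize."""
    blended = {t: w * (1 + tilt * inference_share[t])
               for t, w in base_weights.items()}
    total = sum(blended.values())
    return {t: w / total for t, w in blended.items()}

# Made-up starting weights and inference-revenue shares:
base = {"NVDA": 0.50, "GOOG": 0.20, "AMZN": 0.20, "QCOM": 0.10}
share = {"NVDA": 0.3, "GOOG": 0.8, "AMZN": 0.7, "QCOM": 0.9}

tilted = tilt_weights(base, share)
# Names with above-average inference exposure gain weight; the pure-GPU
# weight shrinks, while all weights still sum to 1.
```

The design point is the renormalization step: the tilt reallocates weight toward inference-heavy names without changing the total, mirroring how an index rebalance shifts exposure rather than adding it.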

Key Companies

NVDA · NVIDIA — training dominant, adapting to inference
GOOG · Google (TPU) — custom inference ASIC (TPU v5e)
AMZN · Amazon — custom silicon: Inferentia2 for inference, Trainium2 for training
QCOM · Qualcomm — edge inference, on-device AI

Closelook View

The inference shift is one of the most important structural themes in AI investing. It doesn't mean NVIDIA loses — it means NVIDIA's dominance becomes less absolute. Portfolio implications: diversify compute exposure beyond pure NVIDIA, monitor custom ASIC adoption rates, and watch inference cost curves.



© 2026 Closelooknet · Thomas Look · Substack · LinkedIn · X

Not financial advice. All content is for informational and educational purposes. Past performance does not guarantee future results.
