Closelook

Inference Economics: How the Training-to-Inference Shift Changes Everything

The AI industry is shifting from a training-dominated phase (where the biggest spend goes to training new models) to an inference-dominated phase (where the biggest spend goes to running models for users at scale). This shift fundamentally changes the semiconductor demand profile: training favors massive GPUs with maximum compute and memory bandwidth, while inference favors efficiency, throughput per watt, and cost per query. The companies that win the training phase (NVIDIA) may not be the same ones that dominate inference. Custom ASICs (Google TPU, AWS Inferentia), inference-optimized architectures (Groq, Cerebras), and edge inference chips (Qualcomm, MediaTek) all gain relevance.

Training vs. Inference: Different Economics

Training is a fixed cost: you train a model once (or periodically retrain) using massive GPU clusters. The economics favor raw compute power — whoever has the most FLOPs wins. This is NVIDIA's domain: A100, H100, B200 are all optimized for training throughput.

Inference is a variable cost: every user query, every API call, every agent action requires inference compute. As AI adoption scales, inference volume compounds with every new user and agent, while training spend stays comparatively flat. The economics shift from "maximum compute" to "minimum cost per query" and "maximum throughput per watt."

This distinction matters enormously for investors. In a training-dominated world, you buy NVIDIA and NVIDIA's supply chain. In an inference-dominated world, the competitive landscape fragments and the value chain shifts.
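The fixed-versus-variable dynamic above can be sketched with a toy calculation. All numbers here are hypothetical, chosen only to illustrate how cumulative inference spend overtakes a one-time training cost; they do not reflect any real model's economics.

```python
import math

# Illustrative fixed-vs-variable cost model. Every figure is a made-up
# assumption, used only to show the crossover dynamic.
TRAINING_COST = 100_000_000    # one-time training spend, USD (hypothetical)
COST_PER_QUERY = 0.002         # marginal inference cost per query, USD (hypothetical)
QUERIES_PER_DAY = 200_000_000  # assumed steady-state query volume

def cumulative_inference_cost(days: int) -> float:
    """Total inference spend after `days` of operation."""
    return days * QUERIES_PER_DAY * COST_PER_QUERY

def crossover_day() -> int:
    """First day on which cumulative inference spend reaches the training cost."""
    daily_spend = QUERIES_PER_DAY * COST_PER_QUERY
    return math.ceil(TRAINING_COST / daily_spend)

print(crossover_day())  # with these assumptions: 250 days
```

Under these assumptions the one-time training cost is matched by inference spend in well under a year — which is the whole argument for why cost per query, not peak FLOPs, becomes the metric that matters at scale.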

Who Benefits from the Inference Shift

Custom ASICs gain share because hyperscalers (Google, Amazon, Meta) can design chips optimized for their specific inference workloads at a lower cost per query than general-purpose GPUs. Google's TPU v5e, AWS's Inferentia2 (Trainium2 is its training-focused counterpart), and Meta's MTIA are all built with inference economics in mind.

NVIDIA adapts by releasing inference-optimized configurations (L40S, H100 NVL) and pushing software moats (TensorRT-LLM, Triton). NVIDIA won't lose inference entirely — but its market share will be lower than in training.

Edge inference becomes relevant as models shrink enough to run on-device. Qualcomm, MediaTek, and Apple's custom silicon benefit from running AI locally rather than in the cloud.

What This Means for the Functional Index

Closelook tracks the training-to-inference ratio through the Compute layer of the Functional Index. As inference dominates, the weight of custom ASIC and inference-focused companies increases relative to pure GPU plays. The index adapts to reflect this structural shift.
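One way to picture this adaptation is a tilt that blends baseline weights with each name's inference exposure. The weighting scheme, the `tilt` parameter, and the inference-share figures below are all invented for illustration — this is a minimal sketch, not Closelook's actual index methodology.

```python
# Hypothetical weight-tilt sketch. The function name, the tilt factor, and
# all input numbers are assumptions made for illustration only.

def tilt_weights(base_weights: dict[str, float],
                 inference_share: dict[str, float],
                 tilt: float = 0.5) -> dict[str, float]:
    """Scale each base weight by (1 + tilt * inference share), then renormalize."""
    blended = {t: w * (1 + tilt * inference_share[t])
               for t, w in base_weights.items()}
    total = sum(blended.values())
    return {t: w / total for t, w in blended.items()}

# Made-up starting weights and inference-revenue shares:
base = {"NVDA": 0.50, "GOOG": 0.20, "AMZN": 0.20, "QCOM": 0.10}
share = {"NVDA": 0.3, "GOOG": 0.8, "AMZN": 0.7, "QCOM": 0.9}

tilted = tilt_weights(base, share)
# Names with above-average inference exposure gain weight; the pure-GPU
# weight shrinks, while all weights still sum to 1.
```

The design point is the renormalization step: the tilt reallocates weight toward inference-heavy names without changing the total, mirroring how an index rebalance shifts exposure rather than adding it.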

Key Companies

NVDA · NVIDIA — training dominant, adapting to inference
GOOG · Google (TPU) — custom inference ASIC (TPU v5e)
AMZN · Amazon — custom silicon: Inferentia2 for inference, Trainium2 for training
QCOM · Qualcomm — edge inference, on-device AI

Closelook View

The inference shift is one of the most important structural themes in AI investing. It doesn't mean NVIDIA loses — it means NVIDIA's dominance becomes less absolute. Portfolio implications: diversify compute exposure beyond pure NVIDIA, monitor custom ASIC adoption rates, and watch inference cost curves.



© 2026 Closelooknet · Thomas Look · Substack · LinkedIn · X

Not financial advice. All content is for informational and educational purposes. Past performance does not guarantee future results.
