Hardware & Computing
Chip architecture, GPUs, quantum computing, custom silicon, semiconductor supply chains
AI compute infrastructure in early 2026 is defined by a paradigm shift: the unit of compute is no longer the GPU chip but the rack. NVIDIA's Vera Rubin platform — six co-designed chips delivering 50 PFLOPS FP4 inference per GPU, 288 GB HBM4 at 22 TB/s, and 260 TB/s aggregate bandwidth across a 72-GPU rack — sets the new performance ceiling. Cableless modular trays, co-packaged silicon photonics, and liquid cooling at 180-220 kW per rack represent engineering at the extreme edge of what's deployable.
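The per-GPU figures above can be rolled up to rack level with simple arithmetic. A minimal sketch, assuming linear scaling across all 72 GPUs (an idealization) and using only the numbers cited in this section; note that the 260 TB/s figure is the rack's interconnect fabric bandwidth, which is separate from the aggregate HBM bandwidth computed here:

```python
# Back-of-envelope rack totals from the per-GPU Vera Rubin figures.
# Assumes ideal linear scaling across the rack (an idealization).
GPUS_PER_RACK = 72
FP4_PFLOPS_PER_GPU = 50    # PFLOPS, FP4 inference
HBM4_GB_PER_GPU = 288      # GB of HBM4 per GPU
HBM4_TBPS_PER_GPU = 22     # TB/s memory bandwidth per GPU

rack_fp4_eflops = GPUS_PER_RACK * FP4_PFLOPS_PER_GPU / 1000  # PFLOPS -> EFLOPS
rack_hbm_tb = GPUS_PER_RACK * HBM4_GB_PER_GPU / 1000         # GB -> TB
rack_hbm_tbps = GPUS_PER_RACK * HBM4_TBPS_PER_GPU            # aggregate HBM BW

print(f"Rack FP4 compute:  {rack_fp4_eflops:.1f} EFLOPS")  # 3.6 EFLOPS
print(f"Rack HBM capacity: {rack_hbm_tb:.1f} TB")          # 20.7 TB
print(f"Aggregate HBM BW:  {rack_hbm_tbps} TB/s")          # 1584 TB/s
```

Treating the rack as a single ~3.6 EFLOPS, ~20.7 TB device is what "the rack is the product" means in practice.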
Meanwhile, the custom silicon insurgency is real but more nuanced than headlines suggest. Hyperscaler ASICs (Google TPU v7, Amazon Trainium 3, Microsoft Maia 200, Meta MTIA) are growing at 44.6% CAGR versus 16.1% for GPUs. NVIDIA's inference share is projected to drop from 90% to 20-30% by 2028 — but the total market expands so fast that absolute revenue grows. The GPU monopoly isn't ending; it's evolving from chip-level to system-level lock-in, with open-source inference stacks (Triton, vLLM) as the key wildcard that could shift the balance.
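The share-versus-revenue point above is easy to verify with compound growth. A minimal sketch using the CAGRs cited in this section; the 2026 baseline revenues are hypothetical round numbers chosen for illustration, not figures from the report:

```python
# Illustrative only: hypothetical 2026 baselines (assumption, not from the
# report) grown at the cited CAGRs, showing that GPU market share can fall
# while GPU absolute revenue still rises.
GPU_CAGR, ASIC_CAGR = 0.161, 0.446
gpu_2026, asic_2026 = 100.0, 20.0  # hypothetical $B baselines (assumption)

years = 2  # 2026 -> 2028
gpu_2028 = gpu_2026 * (1 + GPU_CAGR) ** years
asic_2028 = asic_2026 * (1 + ASIC_CAGR) ** years

share_2026 = gpu_2026 / (gpu_2026 + asic_2026)
share_2028 = gpu_2028 / (gpu_2028 + asic_2028)

print(f"GPU revenue: ${gpu_2026:.0f}B -> ${gpu_2028:.1f}B")  # grows ~35%
print(f"GPU share:   {share_2026:.1%} -> {share_2028:.1%}")  # falls
```

With these assumed baselines, GPU revenue grows roughly 35% in absolute terms even as its share of the combined market declines, which is the dynamic the section describes.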
Frontier — What's Moving Now
- Rack-scale co-design — NVIDIA NVL72: 72 GPUs, 260 TB/s, cableless modular trays. The rack is the product.
- HBM4 deployed — 22 TB/s bandwidth (2.8x over HBM3e), 288 GB capacity. Memory bandwidth bottleneck easing.
- Custom ASIC growth at 44.6% CAGR — TPU v7, Trainium 3, Maia 200, MTIA all advancing. Inference long-tail shifting to ASICs.
- Silicon photonics in production — Spectrum-6 at 102.4 Tb/s, 64x signal integrity improvement. Enables rack-scale bandwidth.
- NVIDIA inference share projected to drop to 20-30% by 2028 — but system-level lock-in deepens via rack co-design.
Concept Map
Concepts
| Concept | Sources | Evidence | Frontier | Last Updated |
|---|---|---|---|---|
| Rack-Scale AI Compute | 2 (tech report + analysis) | Strong | Active | 2026-04-09 |
| HBM4 Memory Architecture | 2 (tech report + analysis) | Strong | Active | 2026-04-09 |
| Custom Silicon vs GPU | 2 (analysis + tech report) | Strong | Active | 2026-04-09 |
| Silicon Photonics | 1 (tech report) | Strong | Active | 2026-04-09 |
Entities
| Entity | Type | Sources | Key Connection |
|---|---|---|---|
| NVIDIA | Company | 2 | Vera Rubin platform, rack-as-product co-design strategy |
| Vera Rubin | Product | 2 | 6-chip AI supercomputer, 50 PFLOPS FP4, H2 2026 shipping |
Timeline
See timeline.md for chronological developments (January 2026 through H2 2026).
Research Frontier
See frontier.md for active research directions, breakthroughs, and knowledge gaps.
Sources
| # | Title | Type | Date | Status |
|---|---|---|---|---|
| 1 | Inside the NVIDIA Vera Rubin Platform | tech report | 2026-01-05 | compiled |
| 2 | Custom Silicon Inflection 2026 | analysis | 2026-02-25 | compiled |