Hardware & Computing
Chip architecture, GPUs, quantum computing, custom silicon, semiconductor supply chains
AI compute infrastructure in early 2026 is defined by a paradigm shift: the unit of compute is no longer the GPU chip but the rack. NVIDIA's Vera Rubin platform — six co-designed chips delivering 50 PFLOPS FP4 inference per GPU, 288 GB HBM4 at 22 TB/s, and 260 TB/s aggregate bandwidth across a 72-GPU rack — sets the new performance ceiling. Cableless modular trays, co-packaged silicon photonics, and liquid cooling at 180-220 kW per rack represent engineering at the extreme edge of what's deployable.
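The per-GPU figures above can be rolled up to rack level with simple arithmetic. A minimal sketch, assuming linear scaling across all 72 GPUs (an idealization) and using only the numbers cited in this section; note that the 260 TB/s figure is the rack's interconnect fabric bandwidth, which is separate from the aggregate HBM bandwidth computed here:

```python
# Back-of-envelope rack totals from the per-GPU Vera Rubin figures.
# Assumes ideal linear scaling across the rack (an idealization).
GPUS_PER_RACK = 72
FP4_PFLOPS_PER_GPU = 50    # PFLOPS, FP4 inference
HBM4_GB_PER_GPU = 288      # GB of HBM4 per GPU
HBM4_TBPS_PER_GPU = 22     # TB/s memory bandwidth per GPU

rack_fp4_eflops = GPUS_PER_RACK * FP4_PFLOPS_PER_GPU / 1000  # PFLOPS -> EFLOPS
rack_hbm_tb = GPUS_PER_RACK * HBM4_GB_PER_GPU / 1000         # GB -> TB
rack_hbm_tbps = GPUS_PER_RACK * HBM4_TBPS_PER_GPU            # aggregate HBM BW

print(f"Rack FP4 compute:  {rack_fp4_eflops:.1f} EFLOPS")  # 3.6 EFLOPS
print(f"Rack HBM capacity: {rack_hbm_tb:.1f} TB")          # 20.7 TB
print(f"Aggregate HBM BW:  {rack_hbm_tbps} TB/s")          # 1584 TB/s
```

Treating the rack as a single ~3.6 EFLOPS, ~20.7 TB device is what "the rack is the product" means in practice.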
Meanwhile, the custom silicon insurgency is real but more nuanced than headlines suggest. Hyperscaler ASICs (Google TPU v7, Amazon Trainium 3, Microsoft Maia 200, Meta MTIA) are growing at 44.6% CAGR versus 16.1% for GPUs. NVIDIA's inference share is projected to drop from 90% to 20-30% by 2028 — but the total market expands so fast that absolute revenue grows. The GPU monopoly isn't ending; it's evolving from chip-level to system-level lock-in, with open-source inference stacks (Triton, vLLM) as the key wildcard that could shift the balance.
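The share-versus-revenue point above is easy to verify with compound growth. A minimal sketch using the CAGRs cited in this section; the 2026 baseline revenues are hypothetical round numbers chosen for illustration, not figures from the report:

```python
# Illustrative only: hypothetical 2026 baselines (assumption, not from the
# report) grown at the cited CAGRs, showing that GPU market share can fall
# while GPU absolute revenue still rises.
GPU_CAGR, ASIC_CAGR = 0.161, 0.446
gpu_2026, asic_2026 = 100.0, 20.0  # hypothetical $B baselines (assumption)

years = 2  # 2026 -> 2028
gpu_2028 = gpu_2026 * (1 + GPU_CAGR) ** years
asic_2028 = asic_2026 * (1 + ASIC_CAGR) ** years

share_2026 = gpu_2026 / (gpu_2026 + asic_2026)
share_2028 = gpu_2028 / (gpu_2028 + asic_2028)

print(f"GPU revenue: ${gpu_2026:.0f}B -> ${gpu_2028:.1f}B")  # grows ~35%
print(f"GPU share:   {share_2026:.1%} -> {share_2028:.1%}")  # falls
```

With these assumed baselines, GPU revenue grows roughly 35% in absolute terms even as its share of the combined market declines, which is the dynamic the section describes.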
Frontier — What's Moving Now
- Rack-scale co-design — NVIDIA NVL72: 72 GPUs, 260 TB/s, cableless modular trays. The rack is the product.
- HBM4 deployed — 22 TB/s bandwidth (2.8x over HBM3e), 288 GB capacity. Memory bandwidth bottleneck easing.
- Custom ASIC growth at 44.6% CAGR — TPU v7, Trainium 3, Maia 200, MTIA all advancing. Inference long-tail shifting to ASICs.
- Silicon photonics in production — Spectrum-6 at 102.4 Tb/s, 64x signal integrity improvement. Enables rack-scale bandwidth.
- NVIDIA inference share projected to drop to 20-30% by 2028 — but system-level lock-in deepens via rack co-design.
Concept Map
Concepts
| Concept | Sources | Evidence | Frontier | Last Updated |
|---|---|---|---|---|
| Rack-Scale AI Compute | 2 (tech report + analysis) | Strong | Active | 2026-04-09 |
| HBM4 Memory Architecture | 2 (tech report + analysis) | Strong | Active | 2026-04-09 |
| Custom Silicon vs GPU | 2 (analysis + tech report) | Strong | Active | 2026-04-09 |
| Silicon Photonics | 1 (tech report) | Strong | Active | 2026-04-09 |
Entities
| Entity | Type | Sources | Key Connection |
|---|---|---|---|
| NVIDIA | Company | 2 | Vera Rubin platform, rack-as-product co-design strategy |
| Vera Rubin | Product | 2 | 6-chip AI supercomputer, 50 PFLOPS FP4, H2 2026 shipping |
Timeline
See timeline.md for chronological developments (January 2026 through H2 2026).
Research Frontier
See frontier.md for active research directions, breakthroughs, and knowledge gaps.
Sources
| # | Title | Type | Date | Status |
|---|---|---|---|---|
| 1 | Inside the NVIDIA Vera Rubin Platform | tech report | 2026-01-05 | compiled |
| 2 | Custom Silicon Inflection 2026 | analysis | 2026-02-25 | compiled |