SKYLIGHT: A Scalable Hundred-Channel 3D Photonic In-Memory Tensor Core for Real-Time AI Inference
SKYLIGHT proposes a scalable, 3D, WDM-enabled, non-volatile photonic tensor core architecture with hundred-channel parallelism for real-time AI inference.
Abstract
The SKYLIGHT architecture (arXiv 2602.19031, February 2026) is a 3D, non-volatile photonic tensor core for real-time AI inference that uses wavelength-division multiplexing (WDM). It scales to hundred-channel parallelism by co-designing topology, wavelength routing, accumulation, and programming in a 3D stack, with the aim of pushing photonic in-memory compute toward production-scale inference throughput.
Key Contributions
- 3D photonic stack: extends photonic compute from 2D to 3D, increasing parallelism without a proportional increase in chip area.
- Hundred-channel WDM: wavelength-division multiplexing at the hundred-channel scale increases throughput per chip.
- Non-volatile photonic memory: weights are stored in non-volatile photonic elements, which eliminates electrical weight-reload latency.
- Co-designed system: topology, wavelength routing, accumulation, and programming are designed together rather than optimized separately.
Methodology
The paper presents an architectural co-design spanning photonic primitives (microring resonators, MZI meshes, and phase-change-material non-volatile elements), the WDM routing fabric, and the programming interface. It reports validation by simulation and a small-scale tape-out.
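To make the in-memory compute idea concrete, here is a minimal numerical sketch of one common way a WDM photonic tensor core can be modeled. It is not taken from the paper: it assumes that weights are stored as quantized optical transmissions in [0, 1], that each wavelength channel carries its own input vector through the same stored weights, and that a detector per output row sums the products. The names `quantize`, `program_weights`, and `wdm_matvec`, and the choice of 16 transmission levels, are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def quantize(value: float, levels: int) -> float:
    """Snap a transmission to one of `levels` evenly spaced states in [0, 1]."""
    clipped = min(1.0, max(0.0, value))
    return round(clipped * (levels - 1)) / (levels - 1)


def program_weights(weights: Matrix, levels: int = 16) -> Matrix:
    """Model a non-volatile write: each weight becomes a quantized transmission.

    The number of levels is an assumption for illustration only.
    """
    return [[quantize(w, levels) for w in row] for row in weights]


def wdm_matvec(stored: Matrix, inputs_per_channel: Matrix) -> Matrix:
    """Model one optical pass through the stored weights.

    Each entry of `inputs_per_channel` is the input vector carried by one
    wavelength. All wavelengths see the same weights, and each output is the
    sum of transmission * input along one row (the detector's accumulation).
    """
    outputs = []
    for x in inputs_per_channel:  # one input vector per wavelength channel
        outputs.append([sum(t * xi for t, xi in zip(row, x)) for row in stored])
    return outputs


# Three wavelengths share one 2x2 weight bank programmed with 3 levels.
stored = program_weights([[0.0, 1.0], [0.5, 0.5]], levels=3)
result = wdm_matvec(stored, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# result == [[0.0, 0.5], [1.0, 0.5], [1.0, 1.0]]
```

In this model the weights are written once and then reused for every pass, so adding wavelengths adds matrix-vector products per pass without adding weight storage. Signed weights, loss, crosstalk, and detector noise are deliberately left out.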
Results
- Scale: 100+ wavelength channels per tensor core unit.
- Throughput: projected real-time inference throughput at a performance per watt competitive with digital accelerators.
- Validation: circuit-level simulation plus reduced-scale fabrication.
Limitations
- Real-time inference claims are throughput projections from simulation; full-scale fabrication validation pending.
- Yield and manufacturability at hundred-channel scale not fully addressed.
- Non-volatile photonic elements have limited write endurance compared to SRAM/DRAM.
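The endurance limitation can be put in rough numbers with a back-of-envelope estimate. The sketch below assumes a cell fails after a fixed number of write cycles and that every weight update rewrites every cell. The function name `lifetime_days` and the endurance figure of one million cycles are hypothetical placeholders, not values reported by the paper.

```python
def lifetime_days(endurance_cycles: float, rewrites_per_day: float) -> float:
    """Days until a cell exhausts its write budget at a constant rewrite rate."""
    if rewrites_per_day <= 0:
        raise ValueError("rewrites_per_day must be positive")
    return endurance_cycles / rewrites_per_day


ASSUMED_ENDURANCE = 1e6  # hypothetical write-cycle budget per cell

# Reprogramming once per hour versus once per second.
hourly = lifetime_days(ASSUMED_ENDURANCE, 24)          # about 41,667 days
per_second = lifetime_days(ASSUMED_ENDURANCE, 86_400)  # about 11.6 days
```

Under these assumptions, a model that is programmed once and served for a long time barely touches the write budget, while workloads that swap weights frequently would wear cells out quickly. This is why write endurance matters more for model switching than for steady-state inference.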
Full Content
SKYLIGHT addresses one of photonic computing's chronic challenges: scaling from research demos to production-relevant throughput. The 3D + WDM combination is the key architectural insight — 2D photonic chips have hit limits on per-area parallelism, and WDM provides a multiplicative parallelism axis without proportional area cost.
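The multiplicative argument can be written out as simple arithmetic. The sketch below counts multiply-accumulate operations (MACs) per optical pass under the assumption that each weight cell performs one MAC per wavelength, and that stacked layers sit on the same footprint. The function names, the 16x16 array size, and the four layers are hypothetical. Only the figure of 100 wavelength channels comes from the summary above.

```python
def macs_per_pass(rows: int, cols: int, wavelengths: int, layers: int = 1) -> int:
    """MACs in one optical pass: one per weight cell, per wavelength, per layer."""
    return rows * cols * wavelengths * layers


def macs_per_footprint_cell(wavelengths: int, layers: int = 1) -> int:
    """MACs per pass for each cell-sized unit of 2D footprint.

    Stacked layers share the footprint, so both factors multiply the density.
    """
    return wavelengths * layers


planar_single = macs_per_pass(16, 16, wavelengths=1)            # 256
planar_wdm = macs_per_pass(16, 16, wavelengths=100)             # 25,600
stacked_wdm = macs_per_pass(16, 16, wavelengths=100, layers=4)  # 102,400
density = macs_per_footprint_cell(wavelengths=100, layers=4)    # 400
```

In this idealized count the footprint stays at 16x16 cells in all three cases, so the gain comes entirely from the wavelength and layer factors. Real designs pay for multiplexers, detectors, and routing, which this count ignores.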
This connects to a broader pattern in photonic compute across 2025 and 2026: the imec photonic tensor processor (Nature Comms, January 2026, PyTorch-integrated rack unit), Ashtiani et al. on-chip backprop (Nature, March 2026), and Lightmatter production transformers (2025). SKYLIGHT contributes to this shift toward production hardware by addressing the scaling problem.
For frontier AI inference (LLM serving, multimodal model inference), the relevant comparison is throughput-per-watt against H100/H200/B200 deployments. SKYLIGHT's projections are competitive on paper; production validation in 2027-2028 will determine whether photonic in-memory inference becomes a credible deployment substrate.
Source: arXiv 2602.19031 — SKYLIGHT: A Scalable Hundred-Channel 3D Photonic In-Memory Tensor Core Architecture for Real-time AI Inference, February 2026