Lightmatter: A New Kind of Computer — Photonic Processor Running Production Transformers
First photonic processor to run unmodified transformers, CNNs, and RL at near-32-bit-float accuracy, with no fine-tuning and no quantization-aware training. 65.5 TOPS (ABFP16) at 78W electrical + 1.6W optical, in a 3D-integrated six-chip design using ~1M photonic components.
Lead
Lightmatter demonstrated a hybrid photonic-electronic processor capable of running production-grade neural networks without modification: no fine-tuning, no quantization-aware training, no architecture changes. It achieves accuracy "approaching 32-bit floating-point digital systems" on transformers (BERT), CNNs (ResNet classification and segmentation), and Atari reinforcement learning. The system spans six 3D-integrated chips and ~1 million photonic components, delivering 65.5 TOPS at 78W electrical plus 1.6W optical.
Key Claims
- Runs unmodified transformers + CNNs + RL — standard BERT, standard ResNet, standard Atari DQN; no model surgery.
- Near-32-bit-float accuracy out of the box — the threshold that defines "production-ready" vs "research demo."
- 65.5 trillion operations/second in ABFP16, an adaptive block floating-point format (sketched after this list).
- 78W electrical + 1.6W optical total power.
- ~1M photonic components across six 3D-integrated chips.
- PyTorch + TensorFlow compatible — works with existing ML frameworks.
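ABFP16's exact parameters are not spelled out in the post; the general idea of block floating point is that a block of values shares one exponent (scale) while each value keeps a fixed-width mantissa, which suits analog matmul hardware. A minimal sketch, with illustrative block size and mantissa width that are assumptions, not Lightmatter's published settings:

```python
import numpy as np

def abfp_roundtrip(x, block_size=32, mantissa_bits=10):
    """Block floating point: each block of values shares one scale
    (derived from the block's max magnitude); values become small
    integers. block_size and mantissa_bits are illustrative only."""
    x = np.asarray(x, dtype=np.float32)
    blocks = x.reshape(-1, block_size)      # assume size divides evenly

    levels = 2 ** (mantissa_bits - 1) - 1   # signed mantissa range
    scale = np.abs(blocks).max(axis=1, keepdims=True) / levels
    scale[scale == 0] = 1.0                 # avoid divide-by-zero

    mantissa = np.clip(np.round(blocks / scale), -levels, levels)
    return (mantissa * scale).reshape(x.shape)  # dequantized values

x = np.random.randn(1024).astype(np.float32)
print("max abs error:", np.abs(x - abfp_roundtrip(x)).max())
```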
Architectural Approach
- Hybrid photonic-electronic — photonic tensor cores for matmul, electronic control + memory.
- Silicon photonics — standard process, not exotic materials.
- 3D vertical integration across six chips.
- Photonic tensor cores handle the compute-dense matmuls; the electronic layer handles control, memory, and nonlinearities, the things CMOS already does well (a hedged sketch of this split follows the list).
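The post does not describe Lightmatter's runtime API, so the sketch below is a guess at what the hybrid split could look like from PyTorch. `photonic_matmul` is a hypothetical stand-in that merely emulates reduced-precision analog compute; only the dense matmul is routed through it, while bias, activation, and control stay in host floating point.

```python
import torch
import torch.nn as nn

def photonic_matmul(a, b):
    """Hypothetical stand-in for a photonic tensor-core offload.
    Emulates an analog, reduced-precision matmul by rounding both
    operands to bfloat16 and accumulating in float32."""
    return a.to(torch.bfloat16).float() @ b.to(torch.bfloat16).float()

class HybridLinear(nn.Module):
    """Linear layer whose matmul goes to the 'photonic' path while the
    bias add and nonlinearity stay on the electronic (host) side."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        y = photonic_matmul(x, self.weight.t())  # compute-dense part
        return torch.relu(y + self.bias)         # electronic part

layer = HybridLinear(768, 768)
print(layer(torch.randn(4, 768)).shape)  # torch.Size([4, 768])
```

Transparent framework support ("PyTorch + TensorFlow compatible") presumably means intercepting matmuls below the framework so user models need no such wrapper; the wrapper here just makes the split visible.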
Demonstrated Workloads
| Workload | Task |
|---|---|
| BERT | Transformer inference |
| ResNet | Image classification |
| ResNet (segmentation) | Semantic segmentation |
| Deep RL | Atari game-playing |
Novelty vs Prior Photonic Work
- First photonic demonstration of complex, real-world workloads; most prior photonic demos were limited to MNIST-scale toy problems.
- Practical precision without simplified benchmarks: ABFP16 reaches near-FP32 accuracy with no calibration tricks (an error-measurement sketch follows this list).
- Full system integration — not isolated components on an optical bench.
- Framework compatibility — PyTorch/TensorFlow work out of the box.
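No per-workload accuracy numbers are disclosed, but the kind of check behind a "near-FP32" claim is easy to emulate in software: quantize both matmul operands block-wise and compare against the FP32 result. A sketch under the same illustrative block-floating-point assumptions as the earlier snippet:

```python
import numpy as np

def block_fp(x, block=32, bits=10):
    """Shared scale per block of `block` values along the last axis
    (illustrative parameters; Lightmatter's are not disclosed)."""
    b = x.reshape(*x.shape[:-1], -1, block)
    scale = np.abs(b).max(axis=-1, keepdims=True) / (2 ** (bits - 1) - 1)
    scale[scale == 0] = 1.0
    return (np.round(b / scale) * scale).reshape(x.shape)

rng = np.random.default_rng(0)
A = rng.standard_normal((256, 768)).astype(np.float32)  # activations
W = rng.standard_normal((768, 768)).astype(np.float32)  # weights

exact = A @ W
approx = block_fp(A) @ block_fp(W.T).T  # quantize along the shared K axis
rel = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(f"relative error: {rel:.2e}")     # small => near-FP32 regime
```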
Why This Matters
Companion to the two 2026 Nature papers in this topic (photonic tensor processor and Ashtiani on-chip backprop). Lightmatter is the commercial endpoint of the research direction those papers establish: if you can run BERT without modification at near-FP32 accuracy, you have an inference-as-a-service offering that competes with GPU inference on throughput per watt, where photonics' ~20× energy advantage could justify significant silicon cost.
The critical gap is training — Lightmatter's chip is inference-only. Ashtiani et al. (Nokia Bell Labs) solve training on-chip but at toy scale. The 2027-2028 question is whether a production-scale photonic chip can also train, or whether inference-only remains the right product.
Limitations (Inferred)
- Blog claims, not peer-reviewed data; specific accuracy numbers on each workload not disclosed.
- Inference only; no training capability.
- 65.5 TOPS is modest next to GPU throughput; the pitch is energy per operation, not raw speed (see the arithmetic after this list).
- ~1M photonic components is an engineering feat, but yield and reliability at scale not reported.
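For scale, the headline figures imply under one ABFP16 TOPS per watt; whether that wins against a given GPU depends on precision, batch size, and utilization, none of which the post pins down. The arithmetic:

```python
# Throughput per watt from the post's headline figures only.
tops = 65.5                # ABFP16 TOPS
watts = 78.0 + 1.6         # electrical + optical power (W)
print(f"{tops / watts:.2f} TOPS/W")   # ~0.82 TOPS/W
```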
Source: "A New Kind of Computer," Lightmatter, April 9, 2025.