PAPER · 2025-01-20 · Multiple · arXiv 2501.11260

A Survey of World Models for Autonomous Driving

Multiple authors

COMPILED NOTES

Three-tiered taxonomy: future-world generation, behavior planning, integrated closed-loop systems

Key Claims (from abstract/summary)

Three-tiered taxonomy for driving-focused world models:

Tier 1 — Generation of Future Physical World:

Image-based generation
BEV (bird's-eye-view) based generation
OG (occupancy grid) based generation
PC (point cloud) based generation
Methods in this tier improve scene-evolution modeling through diffusion models and 4D occupancy forecasting
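
To make the Tier 1 representations concrete, here is a minimal sketch of how a point cloud (PC) could be rasterized into a binary bird's-eye-view occupancy grid (OG). It is an illustration only and is not taken from the survey: the function name, the grid extent, and the cell size are all assumptions.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, ego vehicle at the origin


def points_to_bev_occupancy(
    points: List[Point],
    extent_m: float = 4.0,  # assumed: grid covers [-extent_m, +extent_m) on x and y
    cell_m: float = 1.0,    # assumed cell edge length
) -> List[List[int]]:
    """Rasterize a point cloud into a binary BEV occupancy grid (1 = occupied)."""
    n = int(2 * extent_m / cell_m)
    grid = [[0] * n for _ in range(n)]
    for x, y, _z in points:  # height is discarded in this top-down view
        if -extent_m <= x < extent_m and -extent_m <= y < extent_m:
            # Clamp guards against floating-point rounding at the upper edge.
            col = min(int((x + extent_m) / cell_m), n - 1)
            row = min(int((y + extent_m) / cell_m), n - 1)
            grid[row][col] = 1
    return grid


# Two points fall inside the 8 m x 8 m window; the third is out of range.
cloud = [(0.5, 0.5, 0.2), (-3.2, 1.7, 0.0), (10.0, 0.0, 0.0)]
grid = points_to_bev_occupancy(cloud)
occupied = sum(map(sum, grid))
```

A 4D occupancy forecaster would, roughly speaking, predict a sequence of such grids (extended to 3D voxels) over future time steps; this sketch only covers the static 2D input representation.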

Tier 2 — Behavior Planning for Intelligent Agents:

Rule-driven paradigms
Learning-based paradigms
Cost-map optimization
Reinforcement learning for trajectory generation in complex traffic
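
As a rough illustration of cost-map optimization, the sketch below scores a handful of candidate trajectories by summing the cost of the grid cells each one passes through and keeps the cheapest. The cost values, candidate set, and function names are hypothetical; planners described in the survey typically learn or predict the cost map rather than hard-coding it.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]  # (row, col) index into the cost map


def trajectory_cost(cost_map: Sequence[Sequence[float]], traj: Sequence[Cell]) -> float:
    """Sum the cost of every cell the trajectory visits."""
    return sum(cost_map[r][c] for r, c in traj)


def select_trajectory(
    cost_map: Sequence[Sequence[float]],
    candidates: Sequence[Sequence[Cell]],
) -> Tuple[int, float]:
    """Return (index, cost) of the lowest-cost candidate trajectory."""
    costs: List[float] = [trajectory_cost(cost_map, t) for t in candidates]
    best = min(range(len(candidates)), key=costs.__getitem__)
    return best, costs[best]


# Toy 3x3 cost map: a high-cost obstacle in the centre, a mild penalty on the right.
cost_map = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 2.0],
    [0.0, 0.0, 0.0],
]
candidates = [
    [(2, 1), (1, 1), (0, 1)],  # straight through the obstacle
    [(2, 1), (1, 0), (0, 1)],  # swerve left
    [(2, 1), (1, 2), (0, 1)],  # swerve right
]
best_index, best_cost = select_trajectory(cost_map, candidates)
```

Here the left swerve is selected because it avoids both the obstacle and the penalized cell. A reinforcement-learning approach would instead learn a policy that produces the trajectory directly from reward signals.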

Tier 3 — Integrated closed-loop systems combining both.

Key Players Referenced

The driving world-model space includes Wayve (GAIA series), NVIDIA (Cosmos), Tesla's internal world models, and academic systems such as MILE and DriveDreamer. This survey organizes that lineage.

Why This Matters

Autonomous driving is the highest-stakes commercial deployment of world models today. Every major AV company runs some form of world model internally, using it for sim-to-real transfer, counterfactual evaluation, and data augmentation. The driving vertical will likely force answers to the pixel-vs-latent debate before the robotics vertical does, because its training budgets are larger.

Notes

Full paper not yet deeply ingested; this is a first-pass stub built from search results. Flag for a deeper read when we start producing commercial-layer coverage of the autonomous driving market structure.

Source: A Survey of World Models for Autonomous Driving
