World Models for Autonomous Driving: An Initial Survey

Yanchen Guan, Haicheng Liao, Zhenning Li, Jia Hu, Runze Yuan, Yunjian Li, Guohui Zhang, Chengzhong Xu

TL;DR

The paper surveys world models for autonomous driving, addressing the need to predict futures under uncertainty and to support safe, efficient decision-making. It traces the evolution from control-theoretic beginnings to neural latent-dynamics frameworks such as RSSM and JEPA, and discusses training via the evidence lower bound (ELBO) alongside energy-based formulations in JEPA. It catalogs driving-specific applications, including Driving Scenario Generation (GAIA-1, DriveDreamer) and Planning/Control (MILE, SEM2, Drive-WM, UniWorld), with multi-modal inputs and 3D representations like occupancy grids. It outlines key challenges (long-term memory integration, sim-to-real generalization, and ethical/safety considerations) and envisions future directions toward cognitive co-piloting and harmonizing driving with urban ecosystems. It underscores potential gains in safety, data efficiency, and scalable simulation for real-world deployment, all framed within a formal modeling paradigm that uses latent dynamics and probabilistic reasoning, such as $\text{ELBO}$ and $\text{E}_w(\boldsymbol{\theta})$, to optimize predictions.

Abstract

In the rapidly evolving landscape of autonomous driving, the capability to accurately predict future events and assess their implications is paramount for both safety and efficiency, critically aiding the decision-making process. World models have emerged as a transformative approach, enabling autonomous driving systems to synthesize and interpret vast amounts of sensor data, thereby predicting potential future scenarios and compensating for information gaps. This paper provides an initial review of the current state and prospective advancements of world models in autonomous driving, spanning their theoretical underpinnings, practical applications, and the ongoing research efforts aimed at overcoming existing limitations. Highlighting the significant role of world models in advancing autonomous driving technologies, this survey aspires to serve as a foundational reference for the research community, facilitating swift access to and comprehension of this burgeoning field, and inspiring continued innovation and exploration.

Paper Structure (25 sections, 9 equations, 6 figures, 1 table)

This paper contains 25 sections, 9 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Number of Publications related to World Models since 2015. (Data sources: Web of Science Core Collection and Preprint Citation Index. Key words: "world model", "world models", "reinforcement learning".)
  • Figure 2: Diagram of an Agent's World Model Framework.
  • Figure 3: Comparative Schematic of RNN, SSM, and RSSM Architectures in Latent Dynamics Modeling.
  • Figure 4: Comparative Schematic of Joint-Embedding Architecture, Generative Architecture, and Joint-Embedding Predictive Architecture.
  • Figure 5: World Models in Autonomous Driving Pipelines.
  • ...and 1 more figure