The Role of World Models in Shaping Autonomous Driving: A Comprehensive Survey
Sifan Tu, Xin Zhou, Dingkang Liang, Xingyu Jiang, Yumeng Zhang, Xiaofan Li, Xiang Bai
TL;DR
This survey assesses Driving World Models (DWM) for autonomous driving, focusing on predicting scene evolution from historical observations and actions across 2D, 3D, and scene-free paradigms. It systematically categorizes methods by predicted modalities, reviews core architectures (diffusion, transformers, latent-state models), and outlines applications in simulation, data generation, anticipative driving, and 4D pre-training. The authors compile high-impact datasets and task-specific metrics, discuss limitations (data scarcity, efficiency, reliability, multi-sensor fusion, and adversarial risks), and propose future directions such as unified tasks and language-assisted supervision. The goal is to clarify progress, identify gaps, and guide researchers toward broader, safer adoption of DWM in real-world autonomous driving.
Abstract
The Driving World Model (DWM), which predicts how a scene evolves during driving, has emerged as a promising paradigm for autonomous driving. DWM methods enable autonomous driving systems to better perceive, understand, and interact with dynamic driving environments. In this survey, we provide a comprehensive overview of recent progress in DWM. We categorize existing approaches by the modalities of the scenes they predict and summarize their specific contributions to autonomous driving. We also review high-impact datasets and the metrics tailored to different tasks within DWM research. Finally, we discuss the limitations of current research and propose future directions. This survey offers insights into the development and application of DWM, with the aim of fostering its broader adoption in autonomous driving. The relevant papers are collected at https://github.com/LMD0311/Awesome-World-Model.
