Dream to Drive with Predictive Individual World Model
Yinfeng Gao, Qichao Zhang, Da-wei Ding, Dongbin Zhao
TL;DR
This work tackles reactive decision-making in urban autonomous driving where other road users’ intentions are unknown. It introduces Predictive Individual World Model (PIWM), which encodes driving scenes at an individual-vehicle level with branched encoders and self-attention to model interactions, and learns intention-aware latent states through a trajectory-prediction objective. A separate behavior model is trained within the world model’s imagination, using cross-attention to fuse ego and nearby vehicle states for discrete speed control. Empirical results on INTERACTION-derived scenarios show that PIWM improves learning efficiency and outperforms both model-free baselines and DreamerV3 across small- and large-scale benchmarks, with ablations confirming the value of individual modeling and interactive prediction. The approach enhances interpretability by decoding predicted trajectories and paves the way for scalable, intention-aware autonomous driving in complex traffic.
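To make the individual-level encoding more concrete, the following is a minimal NumPy sketch of branched encoders, self-attention across vehicles, and cross-attention from the ego vehicle to its neighbors. It is not the authors' implementation: all dimensions, the random linear weights, the `tanh` encoders, and the three-way speed action head are hypothetical choices made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention; q: (Nq, d), k and v: (Nk, d).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
n_vehicles, feat_dim, d, n_actions = 5, 4, 8, 3  # hypothetical sizes

# Branched encoders (assumption): separate linear maps for ego and other vehicles.
W_ego = rng.normal(size=(feat_dim, d))
W_other = rng.normal(size=(feat_dim, d))

ego_obs = rng.normal(size=(1, feat_dim))             # one ego vehicle
other_obs = rng.normal(size=(n_vehicles, feat_dim))  # nearby vehicles

ego_emb = np.tanh(ego_obs @ W_ego)
other_emb = np.tanh(other_obs @ W_other)

# Self-attention over all individual embeddings models vehicle interactions.
tokens = np.concatenate([ego_emb, other_emb], axis=0)  # (1 + N, d)
interactive, self_weights = attention(tokens, tokens, tokens)

# Cross-attention: the ego state queries the nearby vehicles' states.
ego_state, other_states = interactive[:1], interactive[1:]
fused, cross_weights = attention(ego_state, other_states, other_states)

# Hypothetical discrete speed-control head over the fused features.
W_pi = rng.normal(size=(2 * d, n_actions))
logits = np.concatenate([ego_state, fused], axis=-1) @ W_pi
action_probs = softmax(logits)
```

In this sketch, `cross_weights` has one row with one weight per nearby vehicle, so it can be read as how strongly each neighbor influences the ego decision.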
Abstract
Generating reactive driving behavior in complex urban environments remains challenging because other road users' intentions are unknown. Model-based reinforcement learning (MBRL) offers great potential for learning a reactive policy by constructing a world model that provides informative states and supports training in imagination. However, a critical limitation of existing work is its reliance on scene-level, reconstruction-based representation learning, which may overlook key interactive vehicles and struggles to model both the interactions among vehicles and their long-term intentions. This paper therefore presents a novel MBRL method for autonomous driving built on a predictive individual world model (PIWM). PIWM describes the driving environment from an individual-level perspective and captures vehicles' interactive relations and intentions through a trajectory prediction task. A behavior policy is learned jointly with PIWM: it is trained in PIWM's imagination and navigates urban driving scenes effectively by leveraging intention-aware latent states. The proposed method is trained and evaluated in simulation environments built on challenging real-world interactive scenarios. Compared with popular model-free methods and state-of-the-art model-based reinforcement learning methods, the proposed method achieves the best performance in terms of safety and efficiency.
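As a rough illustration of what training "in imagination" means, the sketch below rolls a policy forward inside a stand-in latent model and accumulates a discounted return, without touching a real environment. Everything here is an assumption for illustration: the toy `tanh` transition, the linear reward, the softmax policy, and all sizes are hypothetical and are not taken from PIWM.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_actions, horizon, gamma = 6, 3, 10, 0.99  # hypothetical values

# Stand-ins for learned world-model components (random, not trained).
A = rng.normal(scale=0.3, size=(latent_dim, latent_dim))
B = rng.normal(scale=0.3, size=(n_actions, latent_dim))
w_reward = rng.normal(size=latent_dim)
W_pi = rng.normal(size=(latent_dim, n_actions))

def transition(z, a):
    # Toy latent dynamics: next latent state given current state and action.
    return np.tanh(z @ A + B[a])

def reward(z):
    # Toy reward predicted from the latent state.
    return float(z @ w_reward)

def policy(z):
    # Sample a discrete action from a softmax over linear logits.
    logits = z @ W_pi
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return int(rng.choice(n_actions, p=p))

def imagine(z0, horizon, gamma):
    # Roll out the policy inside the model and sum discounted rewards.
    z, ret, traj = z0, 0.0, []
    for t in range(horizon):
        a = policy(z)
        z = transition(z, a)
        ret += gamma ** t * reward(z)
        traj.append((a, z))
    return ret, traj

z0 = rng.normal(size=latent_dim)  # latent state encoded from an observation
imagined_return, trajectory = imagine(z0, horizon, gamma)
```

In an actual MBRL setup, returns from such imagined rollouts would drive the policy update; that optimization step is omitted here.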
