DiffAD: A Unified Diffusion Modeling Approach for Autonomous Driving
Tao Wang, Cong Zhang, Xingguang Qu, Kun Li, Weiwei Liu, Chang Huang
TL;DR
DiffAD reframes end-to-end autonomous driving as conditional image generation on a unified rasterized BEV, addressing coordination and complexity issues in prior modular and E2E pipelines. It introduces a latent diffusion framework with AdaLN-conditioned denoising, a three-canvas BEV representation, and a Trajectory Extraction Network to jointly learn perception, prediction, and planning. The approach yields state-of-the-art closed-loop performance on CARLA Bench2Drive, with ablations confirming the benefits of joint task optimization, denoising progression, and modality fusion. This work demonstrates the potential of diffusion-based generative modeling to simplify autonomous driving architectures while improving robustness and planning coherence.
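For readers unfamiliar with AdaLN conditioning, the following is a minimal NumPy sketch of adaptive layer normalization in its common form: a conditioning vector is projected to a per-channel shift and scale, which modulate the normalized features. It is not DiffAD's implementation. The names `layer_norm`, `adaln`, `w`, and `b` are hypothetical, and the contents of the conditioning vector (for example a timestep embedding fused with image features) are an assumption.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each token over its feature dimension (no learned affine)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def adaln(x, cond, w, b):
    """Adaptive LayerNorm modulation (illustrative sketch).

    x:    (tokens, dim) features inside a denoising block
    cond: (cond_dim,) conditioning vector (assumed content)
    w, b: (cond_dim, 2 * dim) and (2 * dim,) projection parameters
    """
    # Project the condition to a shift and a scale, one value per channel.
    shift, scale = np.split(cond @ w + b, 2)
    # Modulate the normalized features; scale = 0 and shift = 0 is the identity.
    return layer_norm(x) * (1.0 + scale) + shift
```

With the projection initialized to zero, `adaln` reduces to plain layer normalization, which is why this form is often paired with zero initialization in diffusion transformers.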
Abstract
End-to-end autonomous driving (E2E-AD) has rapidly emerged as a promising approach toward achieving full autonomy. However, existing E2E-AD systems typically adopt a traditional multi-task framework, addressing perception, prediction, and planning tasks through separate task-specific heads. Despite being trained in a fully differentiable manner, these systems still encounter issues with task coordination, and their complexity remains high. In this work, we introduce DiffAD, a novel diffusion probabilistic model that redefines autonomous driving as a conditional image generation task. By rasterizing heterogeneous targets onto a unified bird's-eye view (BEV) and modeling their latent distribution, DiffAD unifies various driving objectives and jointly optimizes all driving tasks in a single framework, significantly reducing system complexity and harmonizing task coordination. The reverse process iteratively refines the generated BEV image, resulting in more robust and realistic driving behaviors. Closed-loop evaluations in CARLA demonstrate the superiority of the proposed method, achieving a new state-of-the-art Success Rate and Driving Score.
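To make the idea of rasterizing heterogeneous targets onto a shared BEV concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's rasterizer: the channel layout (agents, lanes, planned path), the axis-aligned boxes, the 64-pixel grid, and the 32 m extent are all hypothetical choices, as are the function names.

```python
import numpy as np

def world_to_pixel(xy, size, extent):
    """Map ego-centred metric (x, y) in [-extent, extent] to (row, col)."""
    xy = np.asarray(xy, dtype=float)
    scale = size / (2.0 * extent)                      # pixels per metre
    col = np.floor((xy[..., 0] + extent) * scale).astype(int)
    row = np.floor((extent - xy[..., 1]) * scale).astype(int)  # +y is up
    return row, col

def draw_points(canvas, xy, extent):
    """Mark each in-range point of a polyline or trajectory on the canvas."""
    size = canvas.shape[0]
    xy = np.asarray(xy, dtype=float).reshape(-1, 2)
    row, col = world_to_pixel(xy, size, extent)
    keep = (row >= 0) & (row < size) & (col >= 0) & (col < size)
    canvas[row[keep], col[keep]] = 1.0

def draw_box(canvas, center, half_size, extent):
    """Fill an axis-aligned box (simplification: agent heading is ignored)."""
    size = canvas.shape[0]
    cx, cy = center
    hx, hy = half_size
    r0, c0 = world_to_pixel((cx - hx, cy + hy), size, extent)  # top-left
    r1, c1 = world_to_pixel((cx + hx, cy - hy), size, extent)  # bottom-right
    # Clamp to the canvas; boxes fully outside produce empty slices.
    r_lo, r_hi = max(int(r0), 0), max(min(int(r1) + 1, size), 0)
    c_lo, c_hi = max(int(c0), 0), max(min(int(c1) + 1, size), 0)
    canvas[r_lo:r_hi, c_lo:c_hi] = 1.0

def rasterize_bev(agents, lane_points, plan_points, size=64, extent=32.0):
    """Stack three binary canvases: agents, lanes, planned path (assumed layout)."""
    bev = np.zeros((3, size, size), dtype=np.float32)
    for center, half_size in agents:
        draw_box(bev[0], center, half_size, extent)
    draw_points(bev[1], lane_points, extent)
    draw_points(bev[2], plan_points, extent)
    return bev
```

A call such as `rasterize_bev([((10.0, 0.0), (2.0, 1.0))], [(0.0, -10.0), (0.0, 0.0), (0.0, 10.0)], [(0.0, 0.0), (0.0, 5.0)])` returns a `(3, 64, 64)` array. In a latent diffusion setup, an image like this would be encoded to a latent and serve as the generation target, so that perception, prediction, and planning outputs share one representation.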
