Controllable Latent Diffusion for Traffic Simulation

Yizhuo Xiao, Mustafa Suphi Erden, Cheng Wang

TL;DR

This work introduces controllable latent diffusion (CLD) to generate realistic and controllable driving scenarios for autonomous vehicle (AV) safety testing. CLD first trains a variational autoencoder (VAE) to compress trajectories into a latent space, then trains a diffusion model within that space. The denoising process is cast as a reward-driven Markov decision process (MDP), which steers generation toward user-defined objectives while preserving realism. Reinforcement learning (RL) is integrated with diffusion sampling via importance-weighted policy updates, yielding diverse, safety-critical scenarios with fewer collisions and better adherence to road rules, as demonstrated on the nuScenes dataset. The results suggest that reward-guided diffusion strikes a favorable balance between realism and controllability, supporting targeted safety evaluation of autonomous driving systems.
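To make "importance-weighted policy updates" concrete, here is a minimal sketch of how one denoising step could be scored when it is treated as a Gaussian policy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the isotropic Gaussian step, and the PPO-style clipping are all assumptions that the summary above does not confirm.

```python
import math

def gaussian_log_prob(x, mean, std):
    """Log-density of an isotropic Gaussian N(mean, std^2 I) evaluated at x."""
    return sum(
        -0.5 * ((xi - mi) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for xi, mi in zip(x, mean)
    )

def importance_weighted_surrogate(z_next, mean_new, mean_old, std, advantage, clip=0.2):
    """Clipped surrogate objective for a single denoising step.

    z_next   : latent sampled by the old (behavior) denoiser
    mean_new : step mean predicted by the denoiser being updated
    mean_old : step mean predicted by the denoiser that drew z_next
    advantage: reward signal for the decoded trajectory (assumed given)
    clip     : PPO-style clipping range (an assumption, not from the paper)
    """
    log_ratio = (gaussian_log_prob(z_next, mean_new, std)
                 - gaussian_log_prob(z_next, mean_old, std))
    ratio = math.exp(log_ratio)  # importance weight p_new / p_old
    clipped = max(1.0 - clip, min(1.0 + clip, ratio))
    return min(ratio * advantage, clipped * advantage)
```

When the new and old denoisers agree, the importance weight is 1 and the surrogate equals the advantage. The clip keeps a single high-reward sample from pushing the denoiser far from the distribution it was sampled under.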

Abstract

The validation of autonomous driving systems benefits greatly from the ability to generate scenarios that are both realistic and precisely controllable. Conventional approaches, such as real-world test drives, are not only expensive but also lack the flexibility to capture targeted edge cases for thorough evaluation. To address these challenges, we propose controllable latent diffusion (CLD), which guides the training of diffusion models via reinforcement learning to automatically generate a diverse and controllable set of driving scenarios for virtual testing. Our approach removes the reliance on large-scale real-world data by generating complex scenarios whose properties can be finely tuned to challenge and assess autonomous vehicle systems. Experimental results show that our approach achieves the lowest collision rate ($0.098$) and the lowest off-road rate ($0.096$), outperforming existing baselines. The proposed approach significantly improves the realism, stability, and controllability of the generated scenarios, enabling more nuanced safety evaluation of autonomous vehicles.

Paper Structure

This paper contains 15 sections, 15 equations, 2 figures, 1 table, 1 algorithm.

Figures (2)

  • Figure 1: Overview of the CLD framework. A VAE is first trained to encode and reconstruct context and trajectories. Afterward, a diffusion model (DM) is trained in the latent space to learn the distribution of the original data. Subsequently, the DM iteratively denoises the latent states $z^k$, which are then reconstructed to generate trajectories. These trajectories are evaluated by a reward module to obtain $\nabla J$ and further optimize the DM, guiding the generation toward controllable and safe traffic scenarios.
  • Figure 2: Trajectories generated by CLD in various traffic simulations. In each scenario, the red line indicates the trajectory of the ego agent, while other colors represent surrounding agents. The trajectories generated at complex intersections are collision-free and comply with traffic rules, demonstrating the robust and adaptive performance of our approach across diverse driving conditions.
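The data flow described in the Figure 1 caption (iteratively denoise latent states $z^k$, decode them into a trajectory, score the trajectory with a reward module) can be sketched as a toy rollout. Everything below is a hypothetical stand-in chosen for illustration: `denoise_step`, `decode`, and `reward` are not the paper's networks or reward terms, and the trajectory is one-dimensional only to keep the example short.

```python
import random

def denoise_step(z, k, rng):
    """Hypothetical stand-in for the DM reverse step: shrink z, add step-dependent noise."""
    noise_scale = 0.1 * k / 10
    return [0.8 * zi + noise_scale * rng.gauss(0.0, 1.0) for zi in z]

def decode(z):
    """Hypothetical stand-in for the VAE decoder: cumulative sum maps latents to 1-D positions."""
    traj, pos = [], 0.0
    for zi in z:
        pos += zi
        traj.append(pos)
    return traj

def reward(traj, lane_half_width=2.0):
    """Toy reward: fraction of waypoints that stay within the lane bounds."""
    return sum(abs(p) <= lane_half_width for p in traj) / len(traj)

def rollout(num_steps=10, latent_dim=8, seed=0):
    """Sample a latent, denoise it for num_steps steps, decode, and score the result."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    latents = [z]  # keep every z^k so a policy update could revisit each step
    for k in range(num_steps, 0, -1):
        z = denoise_step(z, k, rng)
        latents.append(z)
    traj = decode(z)
    return latents, traj, reward(traj)
```

Storing the full chain of latents matters for the reinforcement learning stage: a reward computed only on the final decoded trajectory has to be credited back to every denoising step that produced it.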