Controllable risk scenario generation from human crash data for autonomous vehicle testing
Qiujing Lu, Xuanhan Wang, Runze Yuan, Wei Lu, Xinyi Gong, Shuo Feng
TL;DR
The paper addresses the challenge of testing autonomous vehicles (AVs) under rare safety-critical conditions by unifying the modeling of nominal and risk-prone driving behavior. It introduces Controllable Risk Agent Generation (CRAG), a framework that learns a structured latent risk space from crash data with a variational autoencoder and then optimizes in that latent space to generate controllable risk scenarios in closed-loop simulation. By combining a risk-aware latent representation with optimization-guided state transitions, CRAG generates diverse, realistic, and targeted safety-critical scenarios while staying faithful to real-world crash patterns. Experiments show improved diversity and controllability in both one-way and intersection scenarios, enabling scalable, targeted evaluation of AV robustness.
Abstract
Ensuring the safety of autonomous vehicles (AVs) requires rigorous testing under both everyday driving and rare, safety-critical conditions. A key challenge lies in simulating environment agents, including background vehicles (BVs) and vulnerable road users (VRUs), that behave realistically in nominal traffic while also exhibiting risk-prone behaviors consistent with real-world accidents. We introduce Controllable Risk Agent Generation (CRAG), a framework designed to unify the modeling of dominant nominal behaviors and rare safety-critical behaviors. CRAG constructs a structured latent space that disentangles normal and risk-related behaviors, enabling efficient use of limited crash data. By combining risk-aware latent representations with optimization-based mode-transition mechanisms, the framework allows agents to shift smoothly and plausibly from safe to risky states over extended horizons, while maintaining high fidelity in both regimes. Extensive experiments show that CRAG improves diversity compared to existing baselines, while also enabling controllable generation of risk scenarios for targeted and efficient evaluation of AV robustness.
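To make the idea of optimization in a latent risk space concrete, the following is a minimal toy sketch, not the CRAG implementation. It assumes a hypothetical linear "decoder" that maps a 2-D latent code to a minimum inter-vehicle gap in meters, and uses plain gradient descent to move a nominal latent code toward a requested gap while penalizing drift from the nominal code. The names `decode_gap`, `optimize_latent`, `W`, `b`, `reg`, and all numeric values are illustrative assumptions; the paper's learned variational autoencoder and its mode-transition mechanism are not reproduced here.

```python
import numpy as np


def decode_gap(z, W, b):
    """Hypothetical stand-in for a learned decoder: latent code -> minimum gap (m)."""
    return float(W @ z + b)


def optimize_latent(z_nominal, W, b, target_gap, reg=0.1, lr=0.05, steps=500):
    """Gradient descent on (gap - target_gap)^2 + reg * ||z - z_nominal||^2.

    The first term steers the decoded scenario toward the requested risk level;
    the second keeps the latent code close to the nominal behavior.
    """
    z = np.asarray(z_nominal, dtype=float).copy()
    for _ in range(steps):
        gap = W @ z + b
        grad = 2.0 * (gap - target_gap) * W + 2.0 * reg * (z - z_nominal)
        z -= lr * grad
    return z


# Assumed toy decoder parameters and a nominal latent code at the origin.
W = np.array([1.5, -0.5])
b = 10.0
z_nominal = np.zeros(2)

# Request two different risk levels by changing the target gap.
z_mild = optimize_latent(z_nominal, W, b, target_gap=5.0)
z_risky = optimize_latent(z_nominal, W, b, target_gap=1.0)

gap_nominal = decode_gap(z_nominal, W, b)
gap_mild = decode_gap(z_mild, W, b)
gap_risky = decode_gap(z_risky, W, b)
print(gap_nominal, gap_mild, gap_risky)
```

In this toy setting, lowering `target_gap` yields a decoded scenario with a smaller gap, which is the sense of "controllable" illustrated here. The regularization weight `reg` trades off how closely the target is met against how far the latent code moves from nominal behavior, so the decoded gap lands near the target rather than exactly on it.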
