Safety-Critical Scenario Generation Via Reinforcement Learning Based Editing
Haolan Liu, Liangjun Zhang, Siva Kumar Sastry Hari, Jishen Zhao
TL;DR
This work tackles the long-tail problem in autonomous vehicle safety by introducing a reinforcement-learning-based scenario editor that sequentially edits driving scenarios via actions like adding agents or perturbing trajectories. It combines an anchor-based risk score with a learned plausibility model (CVAE/autoregressive components) to generate challenging yet realistic safety-critical scenarios, trained with PPO. The framework supports flexible, high-dimensional scenario representations and outperforms prior methods in generating high-quality, diverse risk scenarios while maintaining plausibility. Empirical results in highway-env and Argoverse-informed settings show improved efficiency over black-box optimization and stronger realism compared to baselines, offering a practical tool for AV safety validation and testing.
Abstract
Generating safety-critical scenarios is essential for testing and verifying the safety of autonomous vehicles. Traditional optimization techniques suffer from the curse of dimensionality and restrict the search to fixed parameter spaces. To address these challenges, we propose a deep reinforcement learning approach that generates scenarios by sequential editing, such as adding new agents or modifying the trajectories of existing agents. Our framework employs a reward function consisting of both risk and plausibility objectives. The plausibility objective leverages generative models, such as a variational autoencoder, to learn the likelihood of the generated parameters from the training datasets; it penalizes the generation of unlikely scenarios. Our approach overcomes the dimensionality challenge and explores a wide range of safety-critical scenarios. Our evaluation demonstrates that the proposed method generates safety-critical scenarios of higher quality than previous approaches.
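
The idea of scoring an edited scenario by both risk and plausibility can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional `Scenario` representation, the two edit actions, the inverse time-to-collision risk proxy, the fixed Gaussian that stands in for a learned generative model, and the `weight` trade-off parameter. The reinforcement learning policy that would choose the edits is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    # Hypothetical representation: each agent is (gap to ego in m, closing speed in m/s).
    agents: list = field(default_factory=list)


def add_agent(scenario, gap, closing_speed):
    """Edit action: return a new scenario with one more agent."""
    return Scenario(agents=scenario.agents + [(gap, closing_speed)])


def perturb_agent(scenario, index, d_gap, d_speed):
    """Edit action: return a new scenario with one agent's gap and speed shifted."""
    agents = list(scenario.agents)
    gap, speed = agents[index]
    agents[index] = (gap + d_gap, speed + d_speed)
    return Scenario(agents=agents)


def risk_score(scenario):
    """Toy risk proxy: largest inverse time-to-collision over agents closing in on the ego."""
    risks = [speed / gap for gap, speed in scenario.agents if gap > 0 and speed > 0]
    return max(risks, default=0.0)


def plausibility(scenario, mean=(30.0, 0.0), std=(15.0, 3.0)):
    """Stand-in for a learned likelihood model: mean Gaussian log-density per agent,
    up to an additive constant. The mean and std values are made up."""
    if not scenario.agents:
        return 0.0
    total = 0.0
    for gap, speed in scenario.agents:
        total += -0.5 * ((gap - mean[0]) / std[0]) ** 2
        total += -0.5 * ((speed - mean[1]) / std[1]) ** 2
    return total / len(scenario.agents)


def reward(scenario, weight=0.1):
    """Combined objective: reward risk, penalize unlikely scenarios."""
    return risk_score(scenario) + weight * plausibility(scenario)


# Two sequential edits: add an agent, then move it closer and make it approach the ego.
base = add_agent(Scenario(), gap=30.0, closing_speed=0.0)
edited = perturb_agent(base, 0, d_gap=-20.0, d_speed=5.0)
print(risk_score(base), risk_score(edited))
print(reward(base), reward(edited))
```

In this sketch the second edit raises the risk term while the plausibility term turns negative, so the combined reward reflects the trade-off between the two objectives. How strongly the penalty counts depends entirely on the assumed `weight`.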
