Learning from Risk: LLM-Guided Generation of Safety-Critical Scenarios with Prior Knowledge
Yuhang Wang, Heye Huang, Zhenhua Xu, Kailai Sun, Baoshen Guo, Jinhua Zhao
TL;DR
Learning from Risk addresses the scarcity of rare, safety-critical events in autonomous driving validation by fusing data-driven motion priors with knowledge-guided optimization. The framework combines a CVAE-GNN, which learns latent traffic dynamics from highD and nuScenes, with an LLM that parses scene descriptions into adaptive loss terms to steer generation across risk levels. Experiments in CARLA and SMARTS show that this approach substantially increases long-tail event coverage while preserving realism and sim-to-real fidelity, exposing autonomous driving systems (ADS) to more challenging interactions than existing baselines. The work offers a principled pathway for safety validation and stress-testing of autonomous systems under rare but consequential events.
Abstract
Autonomous driving faces critical challenges in rare long-tail events and complex multi-agent interactions, which are scarce in real-world data yet essential for robust safety validation. This paper presents a high-fidelity scenario generation framework that integrates a conditional variational autoencoder (CVAE) with a large language model (LLM). The CVAE encodes historical trajectories and map information from large-scale naturalistic datasets to learn latent traffic structures, enabling the generation of physically consistent base scenarios. Building on this, the LLM acts as an adversarial reasoning engine, parsing unstructured scene descriptions into domain-specific loss functions and dynamically guiding scenario generation across varying risk levels. This knowledge-driven optimization balances realism with controllability, ensuring that generated scenarios remain both plausible and risk-sensitive. Extensive experiments in CARLA and SMARTS demonstrate that our framework substantially increases the coverage of high-risk and long-tail events, improves consistency between simulated and real-world traffic distributions, and exposes autonomous driving systems to interactions that are significantly more challenging than those produced by existing rule- or data-driven methods. These results establish a new pathway for safety validation, enabling principled stress-testing of autonomous systems under rare but consequential events.
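The abstract does not state the loss terms themselves, so the following is only a minimal sketch of the idea of risk-dependent loss weighting, not the paper's implementation. It assumes two hypothetical terms (a proximity term that pulls an adversary trajectory toward the ego vehicle and a realism term that keeps it near a base trajectory standing in for a CVAE sample) and a hypothetical `RISK_WEIGHTS` table standing in for the weights an LLM might derive from a scene description.

```python
# Hypothetical stand-in for the LLM output: a risk level mapped to loss weights.
# The names and values are assumptions for illustration only.
RISK_WEIGHTS = {
    "low": {"proximity": 0.1, "realism": 1.0},
    "high": {"proximity": 1.0, "realism": 0.1},
}


def mean_sq_gap(a, b):
    """Mean squared same-timestep distance between two (x, y) trajectories."""
    return sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for p, q in zip(a, b)) / len(a)


def guidance_loss(adv, ego, base, w):
    """Weighted sum: closeness to the ego (risk) plus closeness to the base (realism)."""
    return w["proximity"] * mean_sq_gap(adv, ego) + w["realism"] * mean_sq_gap(adv, base)


def refine(ego, base, w, steps=200, lr=0.5):
    """Plain gradient descent on the adversary waypoints, starting from the base trajectory."""
    adv = [list(p) for p in base]
    n = len(adv)
    for _ in range(steps):
        for i in range(n):
            for k in range(2):
                # Analytic gradient of guidance_loss with respect to coordinate k of waypoint i.
                grad = (2.0 / n) * (
                    w["proximity"] * (adv[i][k] - ego[i][k])
                    + w["realism"] * (adv[i][k] - base[i][k])
                )
                adv[i][k] -= lr * grad
    return [tuple(p) for p in adv]


ego = [(float(t), 0.0) for t in range(5)]   # ego drives straight along y = 0
base = [(float(t), 3.5) for t in range(5)]  # base adversary one lane over (3.5 m offset assumed)

low = refine(ego, base, RISK_WEIGHTS["low"])
high = refine(ego, base, RISK_WEIGHTS["high"])
print("low-risk lateral offset:", round(low[0][1], 3))
print("high-risk lateral offset:", round(high[0][1], 3))
```

In this toy setting the high-risk weights pull the adversary much closer to the ego lane than the low-risk weights do, which illustrates the trade-off between realism and risk sensitivity that the abstract describes. A real system would optimize in the CVAE latent space and would add physical-feasibility constraints, neither of which is modeled here.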
