Training Adversarial yet Safe Agent to Characterize Safety Performance of Highly Automated Vehicles
Minghao Zhu, Anmol Sidhu, Keith A. Redmill
TL;DR
The paper tackles safety performance testing for black-box highly automated vehicles (HAV) by moving beyond fixed scenario libraries toward adversarial yet safe scenarios (AYSS). These scenarios are generated by principal other vehicle (POV) policies, trained with reinforcement learning, that interact with the vehicle under test (VUT). The paper formalizes the problem in a multi-agent testing setting, defines a Safety Performance Characterization Policy (SPCP), and describes a pipeline to train POVs, generate AYSS, and compare VUT policies across their full operational design domain (ODD). The approach is demonstrated in two simulation case studies (one-lane car-following and two-lane cut-in), which show that AYSS can reveal differences in aggressiveness and safety outcomes while enabling fair comparison via identical safety objectives. The framework offers a scalable, black-box evaluation method for HAV safety that can be extended to higher automation levels and multiple VUTs, with potential for accelerated testing through rapid POV training.
Abstract
This paper focuses on safety performance testing and characterization of black-box highly automated vehicles (HAV). Existing testing approaches typically obtain the testing outcomes by deploying the HAV into a specific testing environment. Such a testing environment can involve various passively given testing strategies presented by other traffic participants, such as (i) a naturalistic driving policy learned from human drivers, (ii) concrete scenarios extracted from real-world driving data, and (iii) model-based or data-driven adversarial testing methodologies that focus on forcing safety-critical events. The safety performance of the HAV is then characterized by analyzing the obtained testing outcomes with a selected measure, such as the observed collision risk. These testing practices suffer from the scarcity of safety-critical events, have limited operational design domain (ODD) coverage, or are biased toward long-tail unsafe cases. This paper presents a novel and informative testing strategy that differs from these existing practices. The proposal is inspired by the intuition that a relatively safer HAV driving policy would allow the traffic vehicles to exhibit a higher level of aggressiveness while still achieving a certain fixed level of overall safe outcome. One can characterize this interactive strategy between the HAV and the traffic vehicles and use it as a safety performance indicator for the HAV. Under the proposed testing scheme, the HAV is evaluated under its full ODD with a reward function that represents a trade-off between safety and adversity in generating safety-critical events. The proposed methodology is demonstrated in simulation with various HAV designs under different operational design domains.
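To make the trade-off between safety and adversity concrete, the following is a minimal sketch of what such a reward for a traffic vehicle could look like in a one-lane car-following setting. It is not the reward used in the paper: the `StepOutcome` fields, the function name `ayss_reward`, the weights, and the gap threshold are all hypothetical and chosen only for illustration. The sketch rewards the traffic vehicle for braking aggressively and penalizes it when the gap to the HAV shrinks below a safe threshold or a collision occurs.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical per-step summary of one traffic-vehicle/HAV interaction."""
    min_gap_m: float       # smallest bumper-to-bumper gap observed in the step
    pov_decel_mps2: float  # braking magnitude of the traffic vehicle (>= 0)
    collided: bool         # True if the step ended in a collision


def ayss_reward(outcome: StepOutcome,
                adversity_weight: float = 1.0,
                safe_gap_m: float = 2.0,
                collision_penalty: float = 100.0) -> float:
    """Illustrative reward: pay for aggressiveness, penalize unsafe outcomes.

    All parameters are assumptions for this sketch, not values from the paper.
    """
    if outcome.collided:
        # A collision overrides any credit earned for adversity.
        return -collision_penalty
    # Adversity term: harder braking by the traffic vehicle earns more reward.
    adversity = adversity_weight * outcome.pov_decel_mps2
    # Safety term: penalize only the portion of the gap below the threshold.
    shortfall = max(0.0, safe_gap_m - outcome.min_gap_m)
    return adversity - collision_penalty * shortfall / safe_gap_m


if __name__ == "__main__":
    print(ayss_reward(StepOutcome(min_gap_m=5.0, pov_decel_mps2=3.0, collided=False)))
    print(ayss_reward(StepOutcome(min_gap_m=1.0, pov_decel_mps2=3.0, collided=False)))
```

Under a reward of this shape, a traffic policy trained against a safer HAV can afford harder braking before the safety penalty dominates, which matches the intuition stated in the abstract: the level of aggressiveness the traffic vehicle can sustain becomes an indicator of the HAV's safety performance.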
