Evaluation of automated driving system safety metrics with logged vehicle trajectory data
Xintao Yan, Shuo Feng, David J. LeBlanc, Carol Flannagan, Henry X. Liu
TL;DR
This work addresses the problem of fairly evaluating real-time safety metrics for automated driving systems (ADS) when the metrics rely on different behavioral assumptions. It introduces an evaluation framework based on logged trajectories: ground-truth collision-unavoidable moments are identified by solving an evasive-trajectory optimization for the subject vehicle (SV) given the near-future trajectories of the background vehicles (BVs), which eliminates BV-prediction errors. Using a large-scale SUMO-generated dataset and receiver operating characteristic (ROC) and precision-recall (PR) analyses, the study compares time-to-collision (TTC), the PEGASUS Criticality Metric (PCM), and the Model Predictive Instantaneous Safety Metric (MPrISM). MPrISM offers the highest recall and ROC performance, PCM provides higher precision, and TTC lags in complex maneuvers. A demonstration on a real-world crash in Ann Arbor further illustrates the framework's applicability. Overall, the approach provides an objective, scalable means to characterize metric performance and informs metric selection and tuning for ADS safety.
Abstract
Real-time safety metrics are important for automated driving systems (ADS) to assess the risk of driving situations and to assist decision-making. Although a number of real-time safety metrics have been proposed in the literature, a systematic evaluation of their performance has been lacking. Because different safety metrics adopt different behavioral assumptions, it is difficult to compare them and evaluate their performance. To overcome this challenge, we propose an evaluation framework that uses logged vehicle trajectory data, in which the trajectories of both the subject vehicle (SV) and the background vehicles (BVs) are available, so that prediction errors caused by behavioral assumptions can be eliminated. Specifically, we examine whether the SV is in a collision-unavoidable situation at each moment, given the near-future trajectories of all BVs. This levels the playing field for a fair comparison of different safety metrics, because a good safety metric should always raise an alarm before the collision-unavoidable moment. When trajectory data from a large number of trips are available, we can systematically evaluate and compare the statistical performance of different metrics. In the case study, three representative real-time safety metrics, namely the time-to-collision (TTC), the PEGASUS Criticality Metric (PCM), and the Model Predictive Instantaneous Safety Metric (MPrISM), are evaluated using a large-scale simulated trajectory dataset. The proposed evaluation framework helps researchers, practitioners, and regulators characterize different metrics and select appropriate metrics for different applications. Moreover, failure analysis of the moments at which a safety metric failed can reveal its weaknesses, which is valuable for refining and improving the metric.
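The core check described above can be illustrated with a deliberately simplified sketch. It is not the optimization used in the study: it assumes a one-dimensional car-following scene in which straight-line braking at a fixed maximum deceleration is the only evasive maneuver available to the SV, whereas the framework optimizes over SV evasive trajectories. All names (`collision_unavoidable`, `evaluate_ttc`, `MAX_DECEL`, `TTC_THRESHOLD`) and all numbers are hypothetical. The sketch labels each logged moment as collision-unavoidable or not, using the logged BV positions instead of a prediction, and then checks whether a TTC alarm fires before the first collision-unavoidable moment.

```python
from typing import List, Optional, Sequence, Tuple


def ttc(gap: float, closing_speed: float) -> float:
    """Time-to-collision for a 1-D pair; infinite if the gap is not closing."""
    return gap / closing_speed if closing_speed > 0 else float("inf")


def collision_unavoidable(sv_pos: float, sv_speed: float,
                          bv_future_pos: Sequence[float],
                          dt: float, max_decel: float) -> bool:
    """True if the SV hits the BV even under full braking from now on.

    ASSUMPTION: braking is the only evasive maneuver (no steering).
    bv_future_pos holds the LOGGED BV positions at t+dt, t+2*dt, ...
    """
    pos, speed = sv_pos, sv_speed
    for bv_pos in bv_future_pos:
        if speed <= max_decel * dt:
            # SV comes to rest within this step
            pos += speed * speed / (2.0 * max_decel)
            speed = 0.0
        else:
            new_speed = speed - max_decel * dt
            pos += 0.5 * (speed + new_speed) * dt  # exact for constant decel
            speed = new_speed
        if pos >= bv_pos:  # SV front reaches BV rear
            return True
    return False


def first_true(flags: Sequence[bool]) -> Optional[int]:
    """Index of the first True entry, or None if there is none."""
    return next((i for i, flag in enumerate(flags) if flag), None)


def evaluate_ttc(sv_pos: List[float], sv_speed: List[float],
                 bv_pos: List[float], bv_speed: List[float],
                 dt: float, max_decel: float,
                 threshold: float) -> Tuple[Optional[int], Optional[int]]:
    """Return (first TTC alarm index, first collision-unavoidable index)."""
    n = len(sv_pos)
    unavoidable = [
        collision_unavoidable(sv_pos[i], sv_speed[i], bv_pos[i + 1:],
                              dt, max_decel)
        for i in range(n)
    ]
    alarms = [
        ttc(bv_pos[i] - sv_pos[i], sv_speed[i] - bv_speed[i]) < threshold
        for i in range(n)
    ]
    return first_true(alarms), first_true(unavoidable)


# Toy log (hypothetical values): the SV drives at a constant 20 m/s toward a
# BV parked 50 m ahead and never brakes. The BV log is kept longer than the
# SV log so that the braking rollout always has logged BV positions to use.
DT = 0.1             # s, logging interval
MAX_DECEL = 8.0      # m/s^2, assumed SV braking capability
TTC_THRESHOLD = 1.5  # s, assumed alarm threshold

sv_speed = [20.0] * 26
sv_pos = [20.0 * DT * i for i in range(26)]
bv_pos = [50.0] * 80
bv_speed = [0.0] * 80

first_alarm, first_unavoidable = evaluate_ttc(
    sv_pos, sv_speed, bv_pos, bv_speed, DT, MAX_DECEL, TTC_THRESHOLD)
alarmed_in_time = (first_alarm is not None
                   and first_unavoidable is not None
                   and first_alarm <= first_unavoidable)
print(first_alarm, first_unavoidable, alarmed_in_time)
```

Because the label depends only on the logged trajectories and the assumed SV capability, the same labels can be reused to score any metric. Repeating the comparison over many trips and many thresholds is what would yield the ROC and PR statistics.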
