Criticality Metrics for Relevance Classification in Safety Evaluation of Object Detection in Automated Driving
Jörg Gamerdinger, Sven Teufel, Stephan Amann, Oliver Bringmann
TL;DR
The paper addresses safety-oriented evaluation of object detection in automated driving. It surveys a wide range of criticality metrics, demonstrates their limitations on safety-critical scenarios, and proposes two strategies, multi-metric aggregation and bidirectional criticality rating, to improve robustness. On the DeepAccident dataset, single metrics often fail to reliably identify safety-critical frames, whereas the proposed strategies yield substantial gains both in correctly identifying such frames and in signaling risk in time before collisions. The work highlights the need for context-aware, generalizable metrics and offers concrete guidance on combining metrics for offline perception-safety evaluation; an open-source library is planned for future release. Overall, the study moves practical safety evaluation away from reliance on a single metric and toward integrated, bidirectional, scenario-aware assessment of criticality.
Abstract
Ensuring safety is the primary objective of automated driving, which necessitates a comprehensive and accurate perception of the environment. While numerous performance evaluation metrics exist for assessing perception capabilities, incorporating safety-specific metrics is essential to reliably evaluate object detection systems. A key component of safety evaluation is the ability to distinguish between relevant and non-relevant objects, a challenge addressed by criticality or relevance metrics. This paper presents the first in-depth analysis of criticality metrics for safety evaluation of object detection systems. Through a comprehensive review of existing literature, we identify and assess a range of applicable metrics. Their effectiveness is empirically validated using the DeepAccident dataset, which features a variety of safety-critical scenarios. To enhance evaluation accuracy, we propose two novel application strategies: bidirectional criticality rating and multi-metric aggregation. Our approach improves criticality classification accuracy by up to 100%, highlighting its potential to significantly advance the safety evaluation of object detection systems in automated vehicles.
