Beyond ADE and FDE: A Comprehensive Evaluation Framework for Safety-Critical Prediction in Multi-Agent Autonomous Driving Scenarios

Feifei Liu; Haozhe Wang; Zejun Wei; Qirong Lu; Yiyang Wen; Xiaoyu Tang; Jingyan Jiang; Zhijian He

TL;DR

The paper addresses the insufficiency of ADE/FDE in capturing safety-critical and interactive dynamics in autonomous driving. It introduces a three-layer evaluation framework operating over semantic information, agent density, and road geometry, quantified by the Map Information Effectiveness metric $MIE = \frac{\text{Error}_{o} - \text{Error}_{w}}{\sqrt{\text{Error}_{o}}}$, to test predictions under map-free and map-rich conditions. Using nuScenes and AgentFormer as a baseline, the experiments reveal pronounced map dependency and safety-critical failure modes that traditional metrics overlook, especially in high-density and curved-road scenarios. These results establish scenario-aware validation as essential for developing robust, certifiable trajectory predictors for autonomous vehicles.
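
As a rough illustration of the MIE formula above, the sketch below computes it from two scalar error values. The summary does not define the subscripts, so the reading here is an assumption: $\text{Error}_{o}$ is taken as the prediction error without map input and $\text{Error}_{w}$ as the error with map input. The function name and the input check are hypothetical and not taken from the paper.

```python
import math


def map_information_effectiveness(error_without_map: float, error_with_map: float) -> float:
    """MIE = (Error_o - Error_w) / sqrt(Error_o).

    Assumption: Error_o is the error without map input and Error_w is the
    error with map input (e.g. an ADE or FDE value in metres).
    """
    if error_without_map <= 0:
        # The square root in the denominator requires a positive Error_o.
        raise ValueError("error_without_map must be positive")
    return (error_without_map - error_with_map) / math.sqrt(error_without_map)


# Under the assumed reading, a positive value means the map reduced the error,
# and a negative value means the map made the prediction worse.
improved = map_information_effectiveness(4.0, 3.0)   # (4 - 3) / 2 = 0.5
degraded = map_information_effectiveness(4.0, 5.0)   # (4 - 5) / 2 = -0.5
```

Dividing by $\sqrt{\text{Error}_{o}}$ rather than $\text{Error}_{o}$ means the score is not a plain relative improvement: the same absolute gain counts for less when the map-free error is already large.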

Abstract

Current evaluation methods for autonomous driving prediction models rely heavily on simple metrics such as Average Displacement Error (ADE) and Final Displacement Error (FDE). While these metrics offer basic performance assessments, they fail to capture the nuanced behavior of prediction modules in complex, interactive, and safety-critical driving scenarios. For instance, existing benchmarks do not distinguish the influence of nearby versus distant agents, nor do they systematically test model robustness across varying multi-agent interactions. This paper addresses this gap by proposing a novel testing framework that evaluates prediction performance under diverse scene structures, namely map context, agent density, and spatial distribution. Through extensive empirical analysis, we quantify the differential impact of agent proximity on target trajectory prediction and identify scenario-specific failure cases that traditional metrics do not expose. Our findings highlight key vulnerabilities in current state-of-the-art prediction models and demonstrate the importance of scenario-aware evaluation. The proposed framework lays the groundwork for rigorous, safety-driven prediction validation, contributing to the identification of failure-prone corner cases and the development of robust, certifiable prediction systems for autonomous vehicles.
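
For readers unfamiliar with the two baseline metrics, the following minimal sketch computes ADE and FDE for a single predicted trajectory using their standard definitions (mean and final-step Euclidean distance to the ground truth). The function names and the toy trajectories are illustrative assumptions, not code or data from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def ade(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Average Displacement Error: mean Euclidean distance over all time steps."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def fde(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Final Displacement Error: Euclidean distance at the last time step."""
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    return math.dist(pred[-1], gt[-1])


# Toy example (hypothetical values): the prediction matches the ground truth
# for two steps and is 3 m off laterally at the final step.
gt_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)]
pred_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

ade_value = ade(pred_traj, gt_traj)  # (0 + 0 + 3) / 3 = 1.0
fde_value = fde(pred_traj, gt_traj)  # 3.0
```

Both values are purely geometric summaries of one agent's trajectory. Neither encodes how many agents are nearby, how close they are, or what the road geometry looks like, which is the limitation the abstract points to.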
