
Safety-Aligned 3D Object Detection: Single-Vehicle, Cooperative, and End-to-End Perspectives

Brian Hsuan-Cheng Liao, Chih-Hong Cheng, Hasan Esen, Alois Knoll

Abstract

Perception plays a central role in connected and autonomous vehicles (CAVs), underpinning not only conventional modular driving stacks, but also cooperative perception systems and recent end-to-end driving models. While deep learning has greatly improved perception performance, its statistical nature makes perfect predictions difficult to attain. Meanwhile, standard training objectives and evaluation benchmarks treat all perception errors equally, even though only a subset is safety-critical. In this paper, we investigate safety-aligned evaluation and optimization for 3D object detection that explicitly characterize high-impact errors. Building on our previously proposed safety-oriented metric, NDS-USC, and safety-aware loss function, EC-IoU, we make three contributions. First, we present an expanded study of single-vehicle 3D object detection models across diverse neural network architectures and sensing modalities, showing that gains under standard metrics such as mAP and NDS may not translate to safety-oriented criteria represented by NDS-USC. With EC-IoU, we reaffirm the benefit of safety-aware fine-tuning for improving safety-critical detection performance. Second, we conduct an ego-centric, safety-oriented evaluation of AV-infrastructure cooperative object detection models, underscoring their superiority over vehicle-only models and presenting a safety impact analysis that illustrates the potential contribution of cooperative models to "Vision Zero." Third, we integrate EC-IoU into SparseDrive and show that safety-aware perception hardening can reduce the collision rate by nearly 30%, directly improving system-level safety in an end-to-end perception-to-planning framework. Overall, our results indicate that safety-aligned perception evaluation and optimization offer a practical path toward enhancing CAV safety across single-vehicle, cooperative, and end-to-end autonomy settings.

Paper Structure

This paper contains 36 sections, 19 equations, 9 figures, and 6 tables.

Figures (9)

  • Figure 1: Asymmetry in the safety consequences of perception errors: The red prediction overestimates the truck's longitudinal distance and under-covers it relative to the blue prediction, making it more safety-critical. Whereas standard metrics such as IoU and translation error (TE) assign identical values to both predictions, our safety-aligned metrics, USC and EC-IoU, favor the blue prediction as the safer one.
  • Figure 2: USC applies perspective-view (PV) and bird's-eye-view (BEV) projections to enforce two safety-relevant constraints. Adapted from Fig. 2 in [liao2024usc], © 2024 IEEE.
  • Figure 3: IoU and EC-IoU evaluation (with different $\alpha$ values) as a prediction moves over the ground truth along the x-axis. The ego vehicle is assumed at $x=0$, the ground truth $G$ is centered at $x=10$, and the blue box denotes an example prediction $P$ centered at $x=7$ (a toy numerical sketch of this setup follows the figure list). Adapted from Fig. 3 in [liao2024eciou], © 2024 IEEE.
  • Figure 4: Benchmarking results of representative 3D object detectors on nuScenes, labeled by the sensor modality (C: camera, L: lidar, and F: fusion). Top: true-positive error measures (mAIoU, mATE$'$, mAUSC). Bottom: overall metrics (mAP, NDS, NDS-USC). Safety-oriented metrics tolerate certain non-critical errors but emphasize safety-critical mislocalizations, changing the performance picture compared with standard metrics.
  • Figure 5: Qualitative comparison of PETR fine-tuned with the standard loss (red) and our safety-aware EC-IoU loss (blue). EC-IoU shifts predictions toward the ego vehicle, improving coverage of safety-critical regions.
  • ...and 4 more figures
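The setup of Figure 3 lends itself to a small numerical illustration. The sketch below is a deliberately simplified 1D analogue, not the paper's actual EC-IoU definition: the linear proximity weight $w(x) = 1 - \alpha x$ and the way it is substituted into the intersection term are assumptions made for illustration only (see [liao2024eciou] for the precise formulation). It reproduces the asymmetry described in Figure 1: two predictions offset equally toward and away from the ego vehicle receive identical standard IoU, while the ego-centric variant scores the near-side prediction higher.

```python
# Toy 1D analogue of an ego-centric IoU, in the spirit of EC-IoU
# (liao2024eciou). ASSUMPTIONS: the linear proximity weight
# w(x) = 1 - alpha * x and its substitution into the intersection
# term are illustrative choices, not the paper's exact definition.
# The ego vehicle sits at x = 0; larger x means farther away.

def iou_1d(p, g):
    """Standard IoU of two 1D intervals given as (lo, hi)."""
    inter = max(0.0, min(p[1], g[1]) - max(p[0], g[0]))
    union = (p[1] - p[0]) + (g[1] - g[0]) - inter
    return inter / union if union > 0 else 0.0

def weighted_len(lo, hi, alpha):
    """Length of [lo, hi] under w(x) = 1 - alpha * x; for a linear
    weight this is the plain length times the weight at the midpoint."""
    if hi <= lo:
        return 0.0
    return (hi - lo) * (1.0 - alpha * 0.5 * (lo + hi))

def ec_iou_1d(p, g, alpha=0.05):
    """Ego-centric IoU sketch: the intersection term of standard IoU
    is replaced by the ground truth's weighted coverage, so covering
    the ego-facing side of g counts more than covering its far side.
    With alpha = 0 this reduces exactly to iou_1d."""
    lo, hi = max(p[0], g[0]), min(p[1], g[1])
    coverage = weighted_len(lo, hi, alpha) / weighted_len(*g, alpha)
    union = (p[1] - p[0]) + (g[1] - g[0]) - max(0.0, hi - lo)
    return coverage * (g[1] - g[0]) / union

# Setup mirroring Figure 3: ground truth centered at x = 10 (here the
# interval [8, 12]); one prediction shifted toward the ego (center 7)
# and its mirror image shifted away from it (center 13).
g = (8.0, 12.0)
near, far = (5.0, 9.0), (11.0, 15.0)
print(iou_1d(near, g), iou_1d(far, g))        # ~0.143 for both
print(ec_iou_1d(near, g), ec_iou_1d(far, g))  # ~0.164 vs. ~0.121
```

Setting $\alpha = 0$ collapses the sketch to standard IoU, the sanity check one would want from any such weighting; Figure 3 in the paper traces the analogous curves for the authors' actual formulation across different $\alpha$ values.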