Robustness-Aware 3D Object Detection in Autonomous Driving: A Review and Outlook
Ziying Song, Lin Liu, Feiyang Jia, Yadan Luo, Guoxin Zhang, Lei Yang, Li Wang, Caiyan Jia
TL;DR
This review addresses robustness in 3D object detection for autonomous driving, emphasizing that practical safety requires maintaining performance under environmental variations, sensor noise, and calibration misalignment. It surveys camera-only, LiDAR-only, and multi-modal detectors, introducing a robustness-focused taxonomy and evaluating methods on corruption benchmarks such as KITTI-C and nuScenes-C to compare accuracy, latency, and robustness. The analysis shows multi-modal fusion generally offers superior robustness, while single-modality approaches are more vulnerable to noise and environmental changes, underscoring the need for robustness-aware design and evaluation. The paper aims to guide deployment and future research toward robustness-centric, real-world-ready perception systems for safe autonomous driving.
Abstract
In modern autonomous driving, the perception system is indispensable for accurately assessing the state of the surrounding environment, thereby enabling informed prediction and planning. The key step in this system is 3D object detection, which uses vehicle-mounted sensors such as LiDAR and cameras to identify the size, category, and location of nearby objects. Despite the surge in 3D object detection methods aimed at improving detection precision and efficiency, few studies systematically examine their resilience to environmental variations, noise, and weather changes. This study emphasizes the importance of robustness, alongside accuracy and latency, in evaluating perception systems under practical scenarios. Our work presents an extensive survey of camera-only, LiDAR-only, and multi-modal 3D object detection algorithms, evaluating the trade-offs among accuracy, latency, and robustness, particularly on datasets such as KITTI-C and nuScenes-C to ensure fair comparisons. Among these, multi-modal 3D detection approaches exhibit superior robustness. We also introduce a novel taxonomy that reorganizes the literature for clarity. This survey aims to offer a more practical perspective on the current capabilities and constraints of 3D object detection algorithms in real-world applications, steering future research toward robustness-centric advancements.
