Towards Stable 3D Object Detection
Jiabao Wang, Qiang Meng, Guochao Liu, Liujiang Yan, Ke Wang, Ming-Ming Cheng, Qibin Hou
TL;DR
This work addresses the overlooked problem of temporal stability in 3D object detection for autonomous driving by introducing the Stability Index (SI), a comprehensive metric that jointly evaluates the stability of confidence, box localization, extent, and heading. It analyzes the limitations of existing metrics and proposes a principled stability framework featuring projection with a pivot box, element decoupling, and aggregation to ensure symmetry and marginal unimodality. To improve stability, the authors propose Prediction Consistency Learning (PCL), a training strategy that enforces cross-frame prediction consistency under augmentations without adding inference cost. Empirical results on the Waymo Open Dataset show that SI complements mAPH and that PCL can significantly boost SI (e.g., raising CenterPoint's vehicle SI from 80.52 to 86.00) while preserving accuracy or affecting it only modestly, indicating practical benefits for safer autonomous driving systems.
Abstract
In autonomous driving, the temporal stability of 3D object detection greatly impacts driving safety. However, detection stability cannot be assessed by existing metrics such as mAP and MOTA, and consequently is less explored by the community. To bridge this gap, this work proposes the Stability Index (SI), a new metric that comprehensively evaluates the stability of 3D detectors in terms of confidence, box localization, extent, and heading. By benchmarking state-of-the-art object detectors on the Waymo Open Dataset, SI reveals interesting properties of object stability that have not been previously discovered by other metrics. To help models improve their stability, we further introduce a general and effective training strategy called Prediction Consistency Learning (PCL). PCL encourages consistent predictions for the same objects under different timestamps and augmentations, leading to enhanced detection stability. Furthermore, we examine the effectiveness of PCL with the widely used CenterPoint and achieve a remarkable SI of 86.00 for the vehicle class, surpassing the baseline by 5.48. We hope our work can serve as a reliable baseline and draw the community's attention to this crucial issue in 3D object detection. Code will be made publicly available.
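To make the idea of cross-frame prediction consistency concrete, the following is a minimal illustrative sketch, not the paper's PCL loss or SI formula. It assumes that a detector is consistent when its errors relative to ground truth agree across two frames of the same object, and it sums L1 differences over the four elements named in the abstract (confidence, localization, extent, heading). The function names, the dictionary layout, the unweighted sum, and the use of world-frame rather than object-frame residuals are all assumptions made for illustration.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def residual(pred, gt):
    """Error of one predicted box relative to its own ground-truth box."""
    return {
        "center": [p - g for p, g in zip(pred["center"], gt["center"])],
        "size": [p - g for p, g in zip(pred["size"], gt["size"])],
        "heading": wrap_angle(pred["heading"] - gt["heading"]),
    }


def consistency_penalty(pred_a, gt_a, pred_b, gt_b):
    """Hypothetical penalty for one object observed in two frames (a and b).

    Assumption: comparing residuals instead of raw boxes removes the
    object's true motion, so only prediction jitter is penalized.
    """
    res_a, res_b = residual(pred_a, gt_a), residual(pred_b, gt_b)
    conf_term = abs(pred_a["score"] - pred_b["score"])
    center_term = sum(abs(a - b) for a, b in zip(res_a["center"], res_b["center"]))
    size_term = sum(abs(a - b) for a, b in zip(res_a["size"], res_b["size"]))
    heading_term = abs(wrap_angle(res_a["heading"] - res_b["heading"]))
    return conf_term + center_term + size_term + heading_term


# The object moves 2 m along x between frames; its size and heading are fixed.
gt_t0 = {"center": (10.0, 5.0), "size": (4.5, 2.0), "heading": 0.0}
gt_t1 = {"center": (12.0, 5.0), "size": (4.5, 2.0), "heading": 0.0}

# Same small error in both frames: inaccurate but stable.
pred_t0 = {"score": 0.9, "center": (10.1, 5.0), "size": (4.6, 2.0), "heading": 0.02}
pred_t1 = {"score": 0.9, "center": (12.1, 5.0), "size": (4.6, 2.0), "heading": 0.02}

# Confidence drops and the center error flips sign: unstable.
pred_t1_jitter = {"score": 0.6, "center": (11.8, 5.0), "size": (4.6, 2.0), "heading": 0.02}

stable = consistency_penalty(pred_t0, gt_t0, pred_t1, gt_t1)
jittery = consistency_penalty(pred_t0, gt_t0, pred_t1_jitter, gt_t1)
print(f"stable={stable:.3f} jittery={jittery:.3f}")
```

In this toy example the first pair of predictions is equally wrong in both frames and therefore incurs essentially no penalty, while the second pair is penalized for the confidence drop and the shifted center error. This mirrors the abstract's point that stability is a separate property from accuracy, which accuracy-oriented metrics such as mAP do not capture.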
