Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection
Farzad Nozarian, Shashank Agarwal, Farzaneh Rezaeianaran, Danish Shahzad, Atanas Poibrenski, Christian Müller, Philipp Slusallek
TL;DR
This work addresses the impact of noisy pseudo-labels in semi-supervised 3D object detection by introducing the Reliable Student framework, which uses a mean-teacher setup to align student RoIs with teacher pseudo-labels and weights training with teacher-derived reliability scores. It combines class-aware target assignment, using local foreground thresholds and a top-k IoU sampling strategy, with reliability-based loss weighting to suppress false positive and false negative assignments. On KITTI, the approach achieves state-of-the-art results in low-label regimes: a 6.2% AP improvement for pedestrians at 1% labeled data, and 6.0% AP and 5.7% AP improvements for pedestrians and cyclists, respectively, at 2% labeled data, outperforming 3DIoUMatch and adapted baselines. These contributions make training more robust to pseudo-label noise and indicate practical gains for semi-supervised 3D detection in autonomous driving contexts.
Abstract
Semi-supervised 3D object detection can benefit from pseudo-labeling when labeled data is limited. However, recent approaches have overlooked the impact of noisy pseudo-labels during training, despite efforts to improve pseudo-label quality through confidence-based filtering. In this paper, we examine the impact of noisy pseudo-labels on IoU-based target assignment and propose the Reliable Student framework, which incorporates two complementary approaches to mitigate errors. First, a class-aware target assignment strategy reduces false negative assignments in difficult classes. Second, a reliability weighting strategy suppresses false positive assignment errors while also addressing the false negatives that remain after the first step. The reliability weights are determined by querying the teacher network for confidence scores of the student-generated proposals. Our work surpasses the previous state-of-the-art on the KITTI 3D object detection benchmark on point clouds in the semi-supervised setting. On 1% labeled data, our approach achieves a 6.2% AP improvement for the pedestrian class, despite having only 37 labeled samples available. The improvements are also significant in the 2% setting, reaching 6.0% AP and 5.7% AP for the pedestrian and cyclist classes, respectively.
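To make the two strategies concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that class-aware assignment amounts to comparing each RoI's best IoU against a threshold chosen by the class of the matched pseudo-label, and that reliability weighting scales a per-RoI binary cross-entropy by the teacher's confidence for foreground targets and by one minus that confidence for background targets. The function names, the threshold values, and the exact weighting rule are all hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical per-class foreground IoU thresholds (illustrative values only).
FG_THRESH = {"car": 0.7, "pedestrian": 0.5, "cyclist": 0.5}


def assign_targets(ious, pl_classes, fg_thresh=FG_THRESH):
    """Class-aware foreground/background assignment (sketch).

    ious:       (num_rois, num_pseudo_labels) IoU matrix, num_pseudo_labels >= 1.
    pl_classes: class name of each pseudo-label.
    Returns (targets, matched_idx); targets are 1.0 for foreground, 0.0 otherwise.
    """
    ious = np.asarray(ious, dtype=np.float64)
    matched_idx = ious.argmax(axis=1)
    best_iou = ious[np.arange(ious.shape[0]), matched_idx]
    # The threshold depends on the class of the pseudo-label each RoI matched.
    thresh = np.array([fg_thresh[pl_classes[j]] for j in matched_idx])
    targets = (best_iou >= thresh).astype(np.float64)
    return targets, matched_idx


def reliability_weighted_bce(student_prob, targets, teacher_score, eps=1e-7):
    """Binary cross-entropy weighted by teacher confidence (assumed rule).

    Foreground targets are weighted by the teacher's score for the RoI,
    background targets by (1 - score), so assignments the teacher disagrees
    with contribute less to the loss.
    """
    p = np.clip(np.asarray(student_prob, dtype=np.float64), eps, 1.0 - eps)
    t = np.asarray(targets, dtype=np.float64)
    s = np.asarray(teacher_score, dtype=np.float64)
    bce = -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
    weights = np.where(t == 1.0, s, 1.0 - s)
    return float((weights * bce).sum() / max(weights.sum(), eps))


# Usage: three student RoIs matched against two teacher pseudo-labels.
ious = [[0.60, 0.10], [0.20, 0.55], [0.30, 0.10]]
targets, matched = assign_targets(ious, ["car", "pedestrian"])
loss = reliability_weighted_bce(
    student_prob=[0.5, 0.5, 0.5],
    targets=targets,
    teacher_score=[0.9, 0.8, 0.1],
)
```

In this toy case the first RoI overlaps a car pseudo-label with IoU 0.60, below the assumed car threshold, so it is assigned to background. Because the teacher scores it at 0.9, its background loss term receives a weight of only 0.1, which illustrates how reliability weighting can soften a likely false negative assignment.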
