Reliable Student: Addressing Noise in Semi-Supervised 3D Object Detection

Farzad Nozarian, Shashank Agarwal, Farzaneh Rezaeianaran, Danish Shahzad, Atanas Poibrenski, Christian Müller, Philipp Slusallek

TL;DR

This work addresses the impact of noisy pseudo-labels in semi-supervised 3D object detection by introducing the Reliable Student framework. Built on a mean-teacher setup, it matches the student's RoIs to the teacher's pseudo-labels and uses teacher-derived reliability scores to weight the training loss. It combines class-aware target assignment (local foreground thresholds and top-k IoU sampling) with reliability-based loss weighting to suppress false positive and false negative assignments. On KITTI, the approach achieves state-of-the-art results in low-label regimes: a 6.2% AP improvement for pedestrians at 1% labeled data, and 6.0% AP and 5.7% AP improvements for pedestrians and cyclists at 2% labeled data, outperforming 3DIoUMatch and adapted baselines. These contributions make training more robust to pseudo-label noise and indicate practical gains for semi-supervised 3D detection in autonomous driving contexts.

Abstract

Semi-supervised 3D object detection can benefit from pseudo-labeling when labeled data is limited. However, recent approaches have overlooked the impact of noisy pseudo-labels during training, despite efforts to enhance pseudo-label quality through confidence-based filtering. In this paper, we examine the impact of noisy pseudo-labels on IoU-based target assignment and propose the Reliable Student framework, which incorporates two complementary approaches to mitigate errors. First, a class-aware target assignment strategy reduces false negative assignments in difficult classes. Second, a reliability weighting strategy suppresses false positive assignment errors while also addressing the false negatives that remain after the first step. The reliability weights are determined by querying the teacher network for confidence scores of the student-generated proposals. Our work surpasses the previous state of the art on the KITTI 3D object detection benchmark on point clouds in the semi-supervised setting. On 1% labeled data, our approach achieves a 6.2% AP improvement for the pedestrian class, despite having only 37 labeled samples available. In the 2% setting, the improvements are significant as well: 6.0% AP and 5.7% AP for the pedestrian and cyclist classes, respectively.
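As a rough illustration of the reliability weighting idea described in the abstract, the sketch below weights a per-proposal binary cross-entropy by a teacher confidence score. The function name `reliability_weighted_bce`, the rule that maps teacher scores to weights (teacher score for foreground targets, one minus teacher score otherwise), and the normalization by the weight sum are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def reliability_weighted_bce(student_scores, targets, teacher_scores, eps=1e-7):
    """Weighted binary cross-entropy over a list of proposals.

    student_scores: student objectness scores in [0, 1]
    targets:        objectness targets in [0, 1] from target assignment
    teacher_scores: teacher confidence for the same proposals in [0, 1]

    Assumption (not from the paper): the weight is the teacher score when the
    target is foreground (t >= 0.5) and 1 - teacher score otherwise, so a
    target the teacher disagrees with contributes little to the loss.
    """
    total, weight_sum = 0.0, 0.0
    for s, t, s_hat in zip(student_scores, targets, teacher_scores):
        s = min(max(s, eps), 1.0 - eps)  # clip to keep log() finite
        bce = -(t * math.log(s) + (1.0 - t) * math.log(1.0 - s))
        w = s_hat if t >= 0.5 else 1.0 - s_hat
        total += w * bce
        weight_sum += w
    # Assumption: normalize by the sum of weights rather than the proposal count.
    return total / max(weight_sum, eps)


# Two proposals both assigned a foreground target; the teacher trusts only the first.
loss = reliability_weighted_bce([0.9, 0.2], [1.0, 1.0], [1.0, 0.0])
```

In this toy call the second proposal behaves like a false positive assignment: its weight is zero, so only the first proposal contributes to the loss.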

Paper Structure (16 sections, 4 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Illustrates the need for class-aware foreground thresholds for foreground/background target assignment. The $\mathrm{IoU_{FG}}$ on the x-axis shows the IoU of proposals with respect to pseudo-labels that are foreground relative to ground truths. (a) The default class-agnostic threshold in the PV-RCNN baseline. (b) Our class-aware thresholds. Lowering the threshold and including more foreground proposals can benefit challenging and uncommon classes. It also significantly reduces false negatives with IoUs close to zero. (Best viewed in color)
  • Figure 2: Overview of our Reliable Student framework. It uses a teacher-student network, where the EMA teacher produces high-quality pseudo-label boxes $b_i$. We compute the IoU $u_i$ between $b_i$ and the student's post-NMS proposals $r_i$, followed by a top-k sampling of $r_i$ based on $u_i$. The sampled proposals $r_i$ are injected into the student and teacher RCNN heads to predict the objectness scores $\tilde{s}_i$ and $\hat{s}_i$, respectively. While $\tilde{s}_i$ serves as an input to the RCNN classification loss $\mathcal{L}^{cls}_u$, $\hat{s}_i$ are converted into reliability weights $w_i$ for $\mathcal{L}^{cls}_u$. The class-aware target assignment module uses thresholds for different classes on $u_i$ to assign objectness targets $t_i$ for $\mathcal{L}^{cls}_u$.
  • Figure 3: Illustrates the density of IoU values of proposals with their matched pseudo-label (PL), $u_i$, on the x-axis and their matched ground truth (GT), $v_i$, on the y-axis. Denser regions are shown with darker shades. The red and orange vertical lines denote the local foreground (FG) threshold $\tau^{fg}_{c}$ and the background (BG) threshold $\tau^{bg}$, respectively, while the black horizontal line represents the FG threshold $\Delta_{c}$ for the evaluation mode. Together, these lines divide the plot into six subregions. Subregions (a) and (f) represent false negative and true negative proposals, respectively. (b) and (e) depict proposals that lie in the uncertain region and are assigned soft targets, while (c) and (d) depict true positive and false positive proposals, respectively. The proposals are obtained from the last few training iterations. For better visualization, we omit proposals that are in the background with respect to both GT and PL. All three plots follow the same subregion breakdown. (Best viewed in color)
  • Figure 4: Illustrates the reliability weights assigned for the RCNN classification loss, based on the IoU of the proposals with PLs ($u_i$) on the x-axis and with GT ($v_i$) on the y-axis. The red and orange vertical lines depict the local class-aware foreground (FG) threshold $\tau^{fg}_{c}$ and the background (BG) threshold $\tau^{bg}$, respectively, while the black horizontal line represents the FG threshold $\Delta_{c}$ for the evaluation mode. The color bar on the right shows the intensity of the reliability weights. For better visualization, plots are based on the last few training iterations.
  • Figure 5: Teacher's mean reliability weights, averaged over every few iterations, using the $\mathrm{FG+UC_{FP}+BG}$ weighting type.
  • ...and 1 more figure
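
The class-aware target assignment and the six-subregion breakdown from the captions of Figures 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the linear soft target between the background and foreground thresholds follows the common PV-RCNN-style convention and is assumed here, the numeric thresholds in the usage lines are placeholders, and placing (b) above and (e) below the evaluation threshold is inferred from the caption's ordering.

```python
def assign_target(u, tau_fg, tau_bg):
    """Objectness target t_i from the IoU u_i of a proposal with its pseudo-label.

    tau_fg: local (per-class) foreground threshold, tau_bg: background threshold.
    Assumption: targets in the uncertain region are interpolated linearly.
    """
    if u >= tau_fg:
        return 1.0  # foreground
    if u < tau_bg:
        return 0.0  # background
    return (u - tau_bg) / (tau_fg - tau_bg)  # soft target in (0, 1)


def subregion(u, v, tau_fg, tau_bg, delta):
    """Label a proposal with the subregion (a)-(f) described for Figure 3.

    u: IoU with the matched pseudo-label, v: IoU with the matched ground truth,
    delta: class-specific foreground threshold used in evaluation mode.
    Assumption: (b) lies above delta and (e) below it.
    """
    gt_foreground = v >= delta
    if u < tau_bg:
        return "a" if gt_foreground else "f"  # false negative / true negative
    if u < tau_fg:
        return "b" if gt_foreground else "e"  # uncertain region, soft targets
    return "c" if gt_foreground else "d"      # true positive / false positive


# Placeholder thresholds for illustration only (not values from the paper).
soft = assign_target(0.5, tau_fg=0.75, tau_bg=0.25)
missed = subregion(0.1, 0.8, tau_fg=0.5, tau_bg=0.25, delta=0.5)
spurious = subregion(0.9, 0.1, tau_fg=0.5, tau_bg=0.25, delta=0.5)
```

With these placeholder thresholds, lowering `tau_fg` for a class moves proposals out of the uncertain and background regions into the foreground, which is the effect the Figure 1 caption attributes to class-aware thresholds.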