
DPO: Dual-Perturbation Optimization for Test-time Adaptation in 3D Object Detection

Zhuoxiao Chen, Zixin Wang, Yadan Luo, Sen Wang, Zi Huang

TL;DR

This work tackles the performance degradation of LiDAR-based 3D detectors under test-time distribution shifts. It proposes Dual-Perturbation Optimization (DPO) for Test-Time Adaptation in 3D Object Detection (TTA-3OD), which combines sharpness minimization of the loss in weight space with adversarial perturbations in input space to improve robustness. A reliable Hungarian matcher filters out pseudo-labels that are sensitive to perturbations, and an early cutoff halts adaptation to prevent error accumulation during online self-training. DPO is evaluated on cross-dataset, corruption, and composite shifts; on Waymo $\rightarrow$ KITTI it outperforms the strongest baseline by 57.72% in $\text{AP}_\text{3D}$ and reaches 91% of the fully supervised upper bound. The method adapts online, without access to labeled target data or multi-epoch retraining.

Abstract

LiDAR-based 3D object detection has seen impressive advances in recent times. However, deploying trained 3D detectors in the real world often yields unsatisfactory performance when the distribution of the test data significantly deviates from the training data due to different weather conditions, object sizes, etc. A key factor in this performance degradation is the diminished generalizability of pre-trained models, which creates a sharp loss landscape during training. Such sharpness, when encountered during testing, can precipitate significant performance declines, even with minor data variations. To address the aforementioned challenges, we propose dual-perturbation optimization (DPO) for Test-time Adaptation in 3D Object Detection (TTA-3OD). We minimize the sharpness to cultivate a flat loss landscape to ensure model resiliency to minor data variations, thereby enhancing the generalization of the adaptation process. To fully capture the inherent variability of the test point clouds, we further introduce adversarial perturbation to the input BEV features to better simulate the noisy test environment. As the dual perturbation strategy relies on trustworthy supervision signals, we utilize a reliable Hungarian matcher to filter out pseudo-labels sensitive to perturbations. Additionally, we introduce early Hungarian cutoff to avoid error accumulation from incorrect pseudo-labels by halting the adaptation process. Extensive experiments across three types of transfer tasks demonstrate that the proposed DPO significantly surpasses previous state-of-the-art approaches, specifically on Waymo $\rightarrow$ KITTI, outperforming the most competitive baseline by 57.72% in $\text{AP}_\text{3D}$ and reaching 91% of the fully supervised upper bound.
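
To make the dual-perturbation idea in the abstract concrete, the following is a minimal sketch on a toy linear model, not the authors' implementation. It assumes a squared-error loss, analytic gradients, and a SAM-style step in which both the weights `w` and the input features `z` are moved a fixed L2 distance along their loss gradients before the update gradient is taken. The names `dual_perturbation_step`, `rho_w`, `rho_z`, and `lr` are hypothetical, and how the paper actually combines the two perturbations is not stated in this listing.

```python
import math

def loss(w, z, y):
    """Squared error of a toy linear model: 0.5 * (w.z - y)^2."""
    r = sum(wi * zi for wi, zi in zip(w, z)) - y
    return 0.5 * r * r

def grads(w, z, y):
    """Analytic gradients of the toy loss w.r.t. weights w and input features z."""
    r = sum(wi * zi for wi, zi in zip(w, z)) - y
    return [r * zi for zi in z], [r * wi for wi in w]

def scaled_direction(g, radius):
    """Rescale g to L2 norm `radius`; return zeros if g vanishes."""
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return [0.0] * len(g)
    return [radius * gi / norm for gi in g]

def dual_perturbation_step(w, z, y, rho_w=0.05, rho_z=0.05, lr=0.1):
    """One update using the gradient taken at perturbed weights AND perturbed input.

    Assumption: both perturbations are first-order, gradient-aligned steps
    (as in SAM for the weights); the radii and learning rate are illustrative.
    """
    g_w, g_z = grads(w, z, y)
    eps_w = scaled_direction(g_w, rho_w)   # weight-space perturbation
    eps_z = scaled_direction(g_z, rho_z)   # input-space (feature) perturbation
    w_adv = [wi + e for wi, e in zip(w, eps_w)]
    z_adv = [zi + e for zi, e in zip(z, eps_z)]
    g_adv, _ = grads(w_adv, z_adv, y)      # gradient at the perturbed point
    # Apply the perturbed-point gradient to the ORIGINAL weights.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

w0, z0, y0 = [1.0, -0.5], [2.0, 1.0], 0.0
w1 = dual_perturbation_step(w0, z0, y0)
print(loss(w0, z0, y0), loss(w1, z0, y0))
```

In a real detector the gradients would come from automatic differentiation and `z` would be the BEV feature map; the toy model only shows the order of operations (perturb, re-evaluate the gradient, update the unperturbed weights).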

Paper Structure (41 sections, 12 equations, 7 figures, 6 tables, 1 algorithm)

Figures (7)

  • Figure 1: (1) Loss contour for weight perturbation $\hat{\epsilon}_w$ (left); (2) The loss profile view for input perturbation $\hat{\epsilon}_z$ (right). Our goal is to optimize the loss towards flat minima while ensuring the model's resilience to data perturbations. Darker colors indicate lower loss values.
  • Figure 2: Illustration of the proposed Hungarian matcher for obtaining reliable supervision. We employ the Hungarian algorithm to compute the cost for each pseudo-labeled 3D box (i.e., predictions before perturbation) when paired with its optimally matched counterpart in predictions after perturbation. The reliability of the 3D boxes is categorized into three tiers—high, medium, and low—based on the computed matching cost. During TTA, only 3D boxes of high reliability (e.g., ID 1, 2) are used for updating model weights, and those of low reliability (e.g., ID 4) are treated as background.
  • Figure 3: Results ($\text{AP}_{\text{3D}}$) of adapting across composite shifts (Waymo $\rightarrow$ KITTI-C) at the heavy corruption level. Lighter shades indicate higher performance.
  • Figure 4: Sensitivity to radius $\rho$ in SAM (left), and the pseudo-label threshold $\alpha$ (right) on nuScenes $\rightarrow$ KITTI.
  • Figure 5: Performance trend and variation in the number of point clouds for weight updates across different early-stopping thresholds $C_\text{stop}$.
  • ...and 2 more figures
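
The matching procedure described in the Figure 2 caption can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the cost function `box_cost` (an L1 distance between box parameter tuples), the tier thresholds `low_cost` and `high_cost`, and the requirement that there are at least as many perturbed predictions as pseudo-labels are all hypothetical. The `hungarian` function is a standard O(n^3) potentials-based implementation of the assignment algorithm.

```python
from math import inf

def hungarian(cost):
    """Min-cost one-to-one assignment; returns match[i] = column chosen for row i.

    Requires len(cost) <= len(cost[0]) (no more rows than columns).
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many columns as rows")
    u, v = [0.0] * (n + 1), [0.0] * (m + 1)   # row / column potentials
    p, way = [0] * (m + 1), [0] * (m + 1)     # p[j] = row matched to column j (1-based)
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (m + 1), [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:                         # augment along the found path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            match[p[j] - 1] = j - 1
    return match

def box_cost(a, b):
    """Hypothetical matching cost: L1 distance between box parameter tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))

def reliability_tiers(pseudo, perturbed, low_cost=0.5, high_cost=2.0):
    """Tier each pseudo-label by the cost of its optimal match after perturbation."""
    cost = [[box_cost(a, b) for b in perturbed] for a in pseudo]
    match = hungarian(cost)
    tiers = []
    for i, j in enumerate(match):
        c = cost[i][j]
        tiers.append("high" if c <= low_cost else "medium" if c <= high_cost else "low")
    return tiers

pseudo = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
perturbed = [(20.0, 5.0, 0.0), (0.1, 0.0, 0.0), (10.0, 1.0, 0.0)]
print(reliability_tiers(pseudo, perturbed))
```

Following the caption, only the boxes tiered "high" would be used as supervision for the weight update, while "low" boxes would be treated as background; the handling of "medium" boxes is not described in this listing.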

Theorems & Definitions (1)

  • Definition 3.1: Loss Sharpness
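
The text of Definition 3.1 is not reproduced in this listing. For orientation, the block below gives the standard sharpness measure from sharpness-aware minimization (SAM), written with the symbols $\rho$ and $\hat{\epsilon}_w$ that appear in the figure captions. It is an assumption that the paper's definition takes this form; $\mathcal{L}$ and $w$ are notation introduced here for the loss and the model weights.

```latex
% Sharpness of the loss at weights w within an L2 ball of radius \rho
\max_{\|\epsilon_w\|_2 \le \rho} \; \mathcal{L}(w + \epsilon_w) - \mathcal{L}(w)

% First-order approximation of the maximizing weight perturbation
\hat{\epsilon}_w = \rho \, \frac{\nabla_w \mathcal{L}(w)}{\|\nabla_w \mathcal{L}(w)\|_2}
```

A small value of the first expression means the loss changes little anywhere in the $\rho$-ball around $w$, which is the flat-minimum behaviour illustrated in Figure 1.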