Semi-Supervised Domain Adaptation Using Target-Oriented Domain Augmentation for 3D Object Detection

Yecheol Kim, Junho Lee, Changsoo Park, Hyoung won Kim, Inho Lim, Christopher Chang, Jun Won Choi

TL;DR

TODA tackles domain shift in LiDAR-based 3D object detection under semi-supervised domain adaptation (SSDA) with a two-stage data augmentation framework. TargetMix aligns the source-domain LiDAR distribution with the target domain and mixes source and target scans in a LiDAR-aware polar space, while AdvMix combines adversarial augmentation of unlabeled target data with mixup to reduce the discrepancy between labeled and unlabeled data within the target domain. In the teacher–student training scheme, TargetMix-generated data is used to train a strong teacher, whose pseudo-labels, together with adversarially perturbed samples, are then used to train a robust student with a consistency loss. Results on Waymo→nuScenes and nuScenes→KITTI show that TODA consistently outperforms prior SSDA methods, with large gains when very few target labels are available, and approaches Oracle performance with modest annotation effort.

Abstract

3D object detection is crucial for applications like autonomous driving and robotics. However, in real-world environments, variations in sensor data distribution due to sensor upgrades, weather changes, and geographic differences can adversely affect detection performance. Semi-Supervised Domain Adaptation (SSDA) aims to mitigate these challenges by transferring knowledge from a source domain, abundant in labeled data, to a target domain where labels are scarce. This paper presents a new SSDA method, referred to as Target-Oriented Domain Augmentation (TODA), specifically tailored for LiDAR-based 3D object detection. TODA efficiently utilizes all available data, including labeled data in the source domain and both labeled and unlabeled data in the target domain, to enhance domain adaptation performance. TODA consists of two stages: TargetMix and AdvMix. TargetMix employs mixing augmentation that accounts for LiDAR sensor characteristics to facilitate feature alignment between the source and target domains. AdvMix applies point-wise adversarial augmentation together with mixing augmentation, perturbing the unlabeled data to align the features of labeled and unlabeled data within the target domain. Our experiments on challenging domain adaptation tasks demonstrate that TODA outperforms existing domain adaptation techniques designed for 3D object detection by significant margins. The code is available at: https://github.com/rasd3/TODA.

Paper Structure (24 sections, 6 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: Performance evaluation on the Waymo→nuScenes domain adaptation task, using 0.5%, 1%, and 5% of the labeled data in the target domain. An SSDA method using only 0.5% of the target labels achieves a remarkable performance gain over a UDA method (ST3D). Our TODA also significantly outperforms SSDA3D in all settings. Surprisingly, TODA even surpasses the Oracle performance with only 5% of the labels.
  • Figure 2: Overall architecture of the proposed TODA: First, TargetMix aligns the source-domain data with target-domain data by applying LiDAR Distribution Matching, followed by mixup augmentation in polar coordinates. Then, AdvMix utilizes Adversarial Point Augmentation to perturb the unlabeled data in the target domain, aiming to produce consistent representation of both labeled and unlabeled data. 'P', 'A', and 'M' denote Polar Coordinate-based Mix, Adversarial Point Augmentation, and Point-Mixup respectively.
  • Figure 3: Comparison of TargetMix with PolarMix: TargetMix divides the entire azimuth angle into $2K$ separate sectors while PolarMix divides it into two sectors.
  • Figure 4: t-SNE visualization of features: (a) unlabeled data (green) versus labeled data (red), (b) adversarial examples (green) versus labeled data (red) within the target domain. These features are extracted from the final layer of the teacher model trained with TargetMix.
  • Figure 5: Comparison with SSDA3D at each stage: performance of TODA and SSDA3D across various amounts of labeled data (0.1%, 0.2%, 0.5%, 1%, 5%, and 10%).
  • ...and 1 more figures
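
The Figure 3 caption states that TargetMix divides the full azimuth range into $2K$ sectors, whereas PolarMix uses two. The following is a minimal sketch of that sector-wise mixing idea, not the authors' implementation. The function names `sector_index` and `azimuth_sector_mix`, the `(x, y, z)` tuple representation, and the rule that even-numbered sectors keep source points while odd-numbered sectors keep target points are all assumptions made for illustration; the LiDAR Distribution Matching step and the handling of bounding-box labels are omitted.

```python
import math


def sector_index(x, y, num_sectors):
    """Return the index of the equal-width azimuth sector containing (x, y)."""
    azimuth = math.atan2(y, x) % (2.0 * math.pi)  # map angle into [0, 2*pi)
    idx = int(azimuth / (2.0 * math.pi) * num_sectors)
    return min(idx, num_sectors - 1)  # guard against rounding up to num_sectors


def azimuth_sector_mix(source_points, target_points, k):
    """Interleave two LiDAR scans over 2*k azimuth sectors.

    Assumption (illustrative only): even-numbered sectors keep source points
    and odd-numbered sectors keep target points. Points are (x, y, z) tuples
    expressed in the sensor frame.
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    num_sectors = 2 * k
    mixed = [p for p in source_points
             if sector_index(p[0], p[1], num_sectors) % 2 == 0]
    mixed += [p for p in target_points
              if sector_index(p[0], p[1], num_sectors) % 2 == 1]
    return mixed


if __name__ == "__main__":
    # One point per quadrant; target points are tagged with z = 5.0.
    src = [(1.0, 1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]
    tgt = [(x, y, 5.0) for x, y, _ in src]
    for p in azimuth_sector_mix(src, tgt, k=2):
        print(p)
```

With `k=1` this sketch reduces to a two-sector split, the PolarMix-style case named in the caption; larger `k` interleaves the two domains more finely around the sensor.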