STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning

Yanan Zhang; Chao Zhou; Di Huang

TL;DR

STAL3D is a novel unsupervised domain adaptation framework for 3D object detection in which Self-Training (ST) and Adversarial Learning (AL) collaborate, unleashing the complementary advantages of pseudo labels and feature distribution alignment.

Abstract

Existing 3D object detection suffers from expensive annotation costs and poor transferability to unknown data due to the domain gap. Unsupervised Domain Adaptation (UDA) aims to generalize detection models trained on labeled source domains so that they perform robustly on unexplored target domains, providing a promising solution for cross-domain 3D object detection. Although Self-Training (ST) based cross-domain 3D detection methods, assisted by pseudo-labeling techniques, have achieved remarkable progress, they still suffer from low-quality pseudo-labels when the domain disparity is large, because they lack a process for feature distribution alignment. Adversarial Learning (AL) based methods can effectively align the feature distributions of the source and target domains, but the inability to obtain labels in the target domain forces the adoption of asymmetric optimization losses, resulting in source domain bias. To overcome these limitations, we propose STAL3D, a novel unsupervised domain adaptation framework for 3D object detection in which ST and AL collaborate, unleashing the complementary advantages of pseudo labels and feature distribution alignment. Additionally, a Background Suppression Adversarial Learning (BS-AL) module and a Scale Filtering Module (SFM) are designed specifically for 3D cross-domain scenes, effectively alleviating the large proportion of background interference and the source domain size bias. STAL3D achieves state-of-the-art performance on multiple cross-domain tasks and even surpasses the Oracle results on Waymo $\rightarrow$ KITTI and Waymo $\rightarrow$ KITTI-rain.

Paper Structure

This paper contains 15 sections, 8 equations, 9 figures, 6 tables, 1 algorithm.

Introduction
Related Work
LiDAR-based 3D Object Detection
Cross-domain 3D Object Detection
Method
Framework Overview
Source Domain Pre-training
Self-Training
Background Suppression Adversarial Learning
Scale Filtering Module
Experiments
Experimental Setup
Main Results
Ablation Study
Conclusion

Figures (9)

Figure 1: Domain adaptation paradigms for 3D object detection. (a) Self-Training Based Domain Adaptation (ST); (b) Adversarial Learning Based Domain Adaptation (AL); (c) Collaborating Self-Training and Adversarial Learning (our STAL3D).
Figure 2: Overview of the proposed STAL3D framework. The training process includes two stages: (I) Source Pre-training Stage; (II) Self-Training and Adversarial Learning Stage. In stage (I), Random Object Scaling (ROS) data augmentation is used for pre-training on the source domain dataset to obtain the model initialization. In stage (II), the model is trained through the collaboration of Self-Training with our Scale Filtering Module (SFM) and Background Suppression Adversarial Learning (BS-AL), unleashing the complementary advantages of pseudo-labels and feature distribution alignment.
Figure 3: Background Suppression Adversarial Learning. A Feature Richness Score (FRS) is used to divide the entire scene into a learning region and a suppression region. By using the FRS map to weight the original adversarial training loss, the model focuses on the more valuable foreground regions and avoids the noise introduced by the background.
Figure 4: Visualization of the Feature Richness Score (FRS) map. The top row represents various point cloud scenes, and the bottom row represents the corresponding FRS map. The brighter the color, the higher the score.
Figure 5: Scale Filtering Module. A loss design that filters the scale regression term effectively alleviates the issue of source domain size bias.
...and 4 more figures
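
The Figure 3 caption describes weighting an adversarial loss with a score map so that foreground cells dominate and background cells are suppressed. A minimal sketch of that weighting idea follows. It is illustrative only and is not the paper's formulation: the function names, the per-cell binary cross-entropy, the normalization by the sum of weights, and the hand-written `score_map` (a stand-in for an FRS map, whose actual computation is not given here) are all assumptions. It covers only the discriminator-side loss and leaves out the rest of adversarial training.

```python
import math


def bce(p, label, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a 0/1 domain label."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def weighted_adversarial_loss(domain_probs, score_map, domain_label):
    """Average per-cell domain-classification loss, weighted by a score map.

    domain_probs: 2D list, hypothetical discriminator output per grid cell
                  (probability that the cell comes from the source domain).
    score_map:    2D list of non-negative weights with the same shape
                  (assumed stand-in for a Feature Richness Score map).
    domain_label: 1 for a source-domain scene, 0 for a target-domain scene.
    """
    total, weight_sum = 0.0, 0.0
    for prob_row, score_row in zip(domain_probs, score_map):
        for p, w in zip(prob_row, score_row):
            total += w * bce(p, domain_label)
            weight_sum += w
    # Normalize by the total weight; return 0 if every cell is suppressed.
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy 2x2 grid: only the top-right cell is treated as foreground.
probs = [[0.9, 0.5], [0.5, 0.5]]
foreground_only = [[0.0, 1.0], [0.0, 0.0]]
uniform = [[1.0, 1.0], [1.0, 1.0]]

loss_weighted = weighted_adversarial_loss(probs, foreground_only, domain_label=1)
loss_uniform = weighted_adversarial_loss(probs, uniform, domain_label=1)
```

With the foreground-only map, the loss depends solely on the single highlighted cell, whereas the uniform map lets the easily classified background cell pull the average down. This is the kind of background influence the weighting is meant to remove.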
