Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point Clouds
Zhimin Yuan, Wankang Zeng, Yanfei Su, Weiquan Liu, Ming Cheng, Yulan Guo, Cheng Wang
TL;DR
This work tackles 3D synthetic-to-real unsupervised domain adaptive segmentation by addressing two core gaps: input-level density differences and poor initialization for self-training. It introduces a non-learnable density-guided translator (DGT) to align point density across domains and a two-stage pipeline (DGT-ST) that first uses a prototype-guided category-level adversarial network (PCAN) for a strong initialization, followed by source-aware consistency LaserMix (SAC-LM) within a mean-teacher framework to refine domain-invariant features. On two synthetic-to-real benchmarks, the approach improves mean IoU over state-of-the-art methods by 9.4% (SynLiDAR-to-SemanticKITTI) and 4.3% (SynLiDAR-to-SemanticPOSS). The combination of input-level density alignment, prototype-informed adversarial alignment, and consistency-based self-training provides a practical pathway for robust 3D UDA in real-world LiDAR applications.
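The mean-teacher framework mentioned above keeps a teacher model whose weights are an exponential moving average (EMA) of the student's weights. The sketch below illustrates only that generic update rule on plain dictionaries of floats. The function name `ema_update` and the momentum value are illustrative assumptions, not details taken from the paper or its code.

```python
def ema_update(teacher, student, momentum=0.99):
    """Return teacher weights moved toward the student weights by EMA.

    `teacher` and `student` are dicts mapping parameter names to floats.
    The momentum of 0.99 is an assumed placeholder, not the paper's setting.
    """
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


# Hypothetical one-parameter model: the teacher drifts slowly toward the student.
teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student, momentum=0.9)
```

A momentum close to 1 makes the teacher change slowly, which is why its predictions are commonly used as the more stable pseudo-label source in self-training.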
Abstract
3D synthetic-to-real unsupervised domain adaptive segmentation is crucial to annotating new domains. Self-training is a competitive approach for this task, but its performance is limited by different sensor sampling patterns (i.e., variations in point density) and incomplete training strategies. In this work, we propose a density-guided translator (DGT), which translates point density between domains, and integrate it into a two-stage self-training pipeline named DGT-ST. First, in contrast to existing works that simultaneously conduct data generation and feature/output alignment within unstable adversarial training, we employ the non-learnable DGT to bridge the domain gap at the input level. Second, to provide a well-initialized model for self-training, we propose a category-level adversarial network in stage one that uses prototypes to prevent negative transfer. Finally, building on the designs above, we propose a domain-mixed self-training method with a source-aware consistency loss in stage two to narrow the domain gap further. Experiments on two synthetic-to-real segmentation tasks (SynLiDAR $\rightarrow$ SemanticKITTI and SynLiDAR $\rightarrow$ SemanticPOSS) demonstrate that DGT-ST outperforms state-of-the-art methods, achieving 9.4$\%$ and 4.3$\%$ mIoU improvements, respectively. Code is available at \url{https://github.com/yuan-zm/DGT-ST}.
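The results above are reported in mIoU, the per-class intersection-over-union averaged across classes. As a reading aid, here is a minimal sketch of that metric over per-point labels. The function name `mean_iou`, the `ignore_label` parameter, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration and may differ from the benchmarks' official evaluation scripts.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    `pred` and `target` are equal-length sequences of integer class ids in
    [0, num_classes). Points whose target equals `ignore_label` are skipped.
    """
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # true positives + false positives + false negatives
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and five points (made-up labels).
score = mean_iou([0, 0, 1, 1, 2], [0, 1, 1, 1, 2], num_classes=3)
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is their average.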
