Domain Generalization of 3D Object Detection by Density-Resampling
Shuangzhi Li, Lei Ma, Xingyu Li
TL;DR
This paper tackles single-domain generalization for LiDAR-based 3D object detection, focusing on domain shifts caused by variations in point density. It introduces physical-aware density-based data augmentation (PDDA) to simulate realistic density patterns, and a multi-task learning framework that couples standard detection with a self-supervised 3D scene restoration task to improve the encoder's understanding of the scene. It also proposes a test-time adaptation strategy that uses the restoration objective to adjust the encoder on unseen target domains, further bridging domain gaps. In cross-dataset evaluations on Car, Pedestrian, and Cyclist detection, the method outperforms state-of-the-art SDG methods and, in some cases, even surpasses unsupervised domain adaptation methods.
Abstract
Point-cloud-based 3D object detection suffers from performance degradation when encountering data with novel domain gaps. To address this, single-domain generalization (SDG) aims to make a detection model trained on a single, limited source domain perform robustly on unexplored domains. In this paper, we propose an SDG method to improve the generalizability of 3D object detection to unseen target domains. Unlike prior SDG works for 3D object detection, which focus solely on data augmentation, our work contributes both a novel data augmentation method and a new multi-task learning strategy. Specifically, from the perspective of data augmentation, we design a universal physical-aware density-based data augmentation (PDDA) method to mitigate the performance loss stemming from diverse point densities. From the perspective of learning methodology, we develop a multi-task learning framework for 3D object detection: during source training, in addition to the main detection task, we leverage an auxiliary self-supervised 3D scene restoration task to improve the encoder's comprehension of background and foreground details, leading to better recognition and detection of objects. Furthermore, based on the auxiliary self-supervised task, we propose the first test-time adaptation method for domain generalization of 3D object detection, which efficiently adjusts the encoder's parameters at test time to adapt to unseen target domains and further bridge domain gaps. Extensive cross-dataset experiments covering "Car", "Pedestrian", and "Cyclist" detection demonstrate that our method outperforms state-of-the-art SDG methods and even surpasses unsupervised domain adaptation methods under some circumstances.
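To make the idea of density-based augmentation concrete, the following is a minimal sketch of one simple way to lower the vertical point density of a LiDAR scan: assign each point to a beam by its elevation angle, then keep only every k-th beam. This is an illustrative assumption, not the PDDA procedure itself, whose details are not given in the abstract. The function names, the field-of-view defaults, and the uniform beam spacing are all hypothetical.

```python
import math


def assign_beams(points, num_beams, fov_down_deg, fov_up_deg):
    """Map each (x, y, z) point to a beam index using its elevation angle.

    Assumes beams are spaced uniformly between fov_down_deg and fov_up_deg
    (a simplification; real sensors may use non-uniform spacing).
    """
    span = fov_up_deg - fov_down_deg
    beams = []
    for x, y, z in points:
        # Elevation of the point relative to the sensor's horizontal plane.
        elev = math.degrees(math.atan2(z, math.hypot(x, y)))
        idx = int((elev - fov_down_deg) / span * num_beams)
        # Clamp points that fall outside the assumed field of view.
        beams.append(min(max(idx, 0), num_beams - 1))
    return beams


def resample_density(points, num_beams=64, keep_every=2,
                     fov_down_deg=-25.0, fov_up_deg=3.0):
    """Keep only points on every `keep_every`-th beam.

    With keep_every=2, a scan binned into 64 beams is reduced to 32 beams.
    The default field-of-view values are placeholders, not sensor specs.
    """
    beams = assign_beams(points, num_beams, fov_down_deg, fov_up_deg)
    return [p for p, b in zip(points, beams) if b % keep_every == 0]


if __name__ == "__main__":
    # Four synthetic points at elevations of -15, -5, 5, and 15 degrees.
    scan = [(math.cos(math.radians(e)), 0.0, math.sin(math.radians(e)))
            for e in (-15, -5, 5, 15)]
    sparse = resample_density(scan, num_beams=4, keep_every=2,
                              fov_down_deg=-20.0, fov_up_deg=20.0)
    print(len(scan), "->", len(sparse))
```

Dropping whole beams, rather than random points, keeps the ring-like scan pattern that a lower-resolution sensor would produce, which is the kind of physical plausibility a density-based augmentation would presumably aim for.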
