SOAP: Cross-sensor Domain Adaptation for 3D Object Detection Using Stationary Object Aggregation Pseudo-labelling
Chengjie Huang, Vahdat Abdelzad, Sean Sedwards, Krzysztof Czarnecki
TL;DR
SOAP addresses cross-sensor domain adaptation for LiDAR-based 3D object detection. It uses Scene-level Full-sequence Aggregation to densify stationary objects, then applies Quasi-Stationary Training and Spatial Consistency Post-processing to generate high-quality pseudo-labels. Combining these pseudo-labels with a pre-trained detector improves cross-domain detection and complements existing state-of-the-art domain adaptation methods in both unsupervised and semi-supervised settings, closing over 30% of the domain gap and up to ~90% in some configurations. The approach is validated on nuScenes and Waymo with CenterPoint and VoxelNeXt backbones, with strong gains at longer ranges (30–50 m) and across evaluation metrics (mAP, NDS, and Waymo AP). These results indicate that SOAP can help transfer detectors across evolving sensor hardware in autonomous systems.
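To make the "domain gap closed" percentages concrete, the sketch below uses a convention common in the domain adaptation literature: the adapted model's gain over a source-only baseline, divided by the gain of a fully supervised target-domain model (the "oracle") over the same baseline. That SOAP uses exactly this definition is an assumption here, and the function name and the numbers in the usage lines are purely illustrative, not results from the paper.

```python
def closed_gap(adapted: float, source_only: float, oracle: float) -> float:
    """Percentage of the domain gap closed by an adapted model.

    Assumed convention (not confirmed by the source):
        closed gap = (adapted - source_only) / (oracle - source_only) * 100
    All three arguments are scores on the same target-domain metric (e.g. mAP).
    """
    gap = oracle - source_only
    if gap == 0:
        raise ValueError("oracle and source-only scores are equal; the gap is undefined")
    return 100.0 * (adapted - source_only) / gap


# Illustrative numbers only: baseline 40.0, adapted 50.0, oracle 60.0.
example = closed_gap(adapted=50.0, source_only=40.0, oracle=60.0)
print(f"closed gap: {example:.1f}%")
```

Under this convention, a value of 100% would mean the adapted model matches the oracle, and 0% would mean no improvement over the source-only baseline.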
Abstract
We consider the problem of cross-sensor domain adaptation in the context of LiDAR-based 3D object detection and propose Stationary Object Aggregation Pseudo-labelling (SOAP) to generate high-quality pseudo-labels for stationary objects. In contrast to the current state-of-the-art in-domain practice of aggregating just a few input scans, SOAP aggregates entire sequences of point clouds at the input level to reduce the sensor domain gap. Then, by means of what we call quasi-stationary training and spatial consistency post-processing, the SOAP model generates accurate pseudo-labels for stationary objects, closing at least 30.3% of the domain gap compared to few-frame detectors. Our results also show that state-of-the-art domain adaptation approaches can achieve even greater performance in combination with SOAP, in both the unsupervised and semi-supervised settings.
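The input-level aggregation idea can be illustrated with a minimal sketch: every scan in a sequence is transformed into a shared world frame using its ego pose, and the points are concatenated. Points on stationary objects then accumulate at the same world location and become denser, which is less dependent on any one sensor's scan pattern. This is not the authors' implementation; the function name `aggregate_sequence`, the (N, 3) point layout, and the 4x4 ego-to-world pose matrices are assumptions made for illustration.

```python
import numpy as np


def aggregate_sequence(scans, poses):
    """Merge a sequence of LiDAR scans into one world-frame point cloud.

    scans: list of (N_i, 3) arrays of xyz points in each frame's ego frame.
    poses: list of (4, 4) homogeneous ego-to-world transforms, one per scan.
    Both layouts are assumptions for this sketch.
    """
    world_points = []
    for points, pose in zip(scans, poses):
        # Append a column of ones so the 4x4 transform can be applied directly.
        homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
        world_points.append((homogeneous @ pose.T)[:, :3])
    return np.vstack(world_points)


# A stationary point at world x = 10 m, seen from two ego positions.
pose0 = np.eye(4)
pose1 = np.eye(4)
pose1[0, 3] = 2.0  # the ego vehicle has moved 2 m forward along x
scans = [np.array([[10.0, 0.0, 0.0]]), np.array([[8.0, 0.0, 0.0]])]
merged = aggregate_sequence(scans, [pose0, pose1])
print(merged)
```

In the usage lines, both observations land on the same world coordinate, which is why aggregation densifies stationary objects. Points on moving objects would instead smear along their trajectory, which is consistent with the method's focus on stationary objects.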
