Test-time adaptation for geospatial point cloud semantic segmentation with distinct domain shifts
Puzuo Wang, Wei Yao, Jie Shao, Zhiyi He
TL;DR
This work tackles test-time adaptation for geospatial point cloud semantic segmentation (PCSS) under three practical domain shifts by progressively updating batch normalization (BN) statistics and optimizing the BN affine parameters via self-supervision. The proposed framework combines progressive batch normalization (PBN) with information maximization and reliability-constrained pseudo-labeling to adapt a pre-trained model to unlabeled target data during inference, without access to source data. Across photogrammetric-to-ALS, ALS-to-MLS, and synthetic-to-MLS transfers, the method substantially improves mIoU and OA, with mIoU gains of up to ~20 percentage points; on SensatUrban to Hessigheim 3D it reaches $mIoU=59.46\%$ and $OA=85.97\%$. The results show that BN-centered adaptation is effective, robust to batch variations, and applicable across backbones, providing a practical, privacy-preserving path for real-time domain adaptation in geospatial PCSS.
Abstract
Domain adaptation (DA) techniques help deep learning models generalize across data shifts for point cloud semantic segmentation (PCSS). Test-time adaptation (TTA) adapts a pre-trained model directly to unlabeled data during the inference stage, without access to source data or additional training, which avoids privacy issues and heavy computational cost. We address TTA for geospatial PCSS by introducing three domain shift paradigms: photogrammetric to airborne LiDAR, airborne to mobile LiDAR, and synthetic to mobile laser scanning. We propose a TTA method that progressively updates batch normalization (BN) statistics with each testing batch. In addition, a self-supervised learning module optimizes the learnable BN affine parameters: information maximization increases prediction confidence, and reliability-constrained pseudo-labeling supplies supervisory signals. Experimental results show that our method improves classification accuracy by up to 20\% mIoU and outperforms other methods. For photogrammetric (SensatUrban) to airborne (Hessigheim 3D) adaptation at the inference stage, our method achieves 59.46\% mIoU and 85.97\% OA without retraining or fine-tuning.
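The abstract names three ingredients but does not give their exact formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes an exponential moving average for the progressive BN statistics, the common entropy-plus-diversity form of information maximization, and a fixed confidence threshold as the reliability constraint. All function names and the `momentum` and `threshold` values are hypothetical.

```python
import math


def update_bn_stats(running_mean, running_var, batch, momentum=0.1):
    """Blend running per-channel BN statistics with those of one test batch.

    batch: list of per-point feature vectors, each of length C.
    momentum is an assumed hyperparameter, not a value from the paper.
    """
    n = len(batch)
    new_mean, new_var = [], []
    for c in range(len(running_mean)):
        values = [row[c] for row in batch]
        mu = sum(values) / n
        var = sum((v - mu) ** 2 for v in values) / n  # population variance
        new_mean.append((1 - momentum) * running_mean[c] + momentum * mu)
        new_var.append((1 - momentum) * running_var[c] + momentum * var)
    return new_mean, new_var


def entropy(p, eps=1e-12):
    """Shannon entropy (natural log) of one probability vector."""
    return -sum(q * math.log(q + eps) for q in p)


def information_maximization_loss(probs):
    """Mean per-point entropy minus the entropy of the mean prediction.

    Minimizing it favors confident per-point predictions while keeping
    the predicted class distribution diverse (assumed formulation).
    """
    n = len(probs)
    k = len(probs[0])
    mean_entropy = sum(entropy(p) for p in probs) / n
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    return mean_entropy - entropy(marginal)


def reliable_pseudo_labels(probs, threshold=0.9):
    """Keep the argmax label only where its probability reaches the threshold.

    Points below the threshold get None and would be excluded from the
    pseudo-label loss (the threshold rule is an assumption).
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=p.__getitem__)
        labels.append(best if p[best] >= threshold else None)
    return labels
```

In a full pipeline, each test batch would first refresh the BN statistics, and the two losses would then drive a gradient step on the BN affine parameters only; that optimization step is omitted here because it depends on the backbone and framework.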
