Multi-view Structural Convolution Network for Domain-Invariant Point Cloud Recognition of Autonomous Vehicles
Younggun Kim, Mohamed Abdel-Aty, Beomsik Cho, Seonghoon Ryoo, Soomok Lee
TL;DR
The paper addresses domain shift in LiDAR-based point cloud recognition for autonomous vehicles (AVs). It introduces the Multi-View Structural Convolution Network (MSCN), which extracts robust local and global geometric features using Structural Convolution Layers (SCL) and Structural Aggregation Layers (SAL). To further strengthen domain invariance, the authors adapt the Progressive Domain Expansion framework to generate unseen-domain point clouds from the source domain and train on these synthetic variants. MSCN achieves an average cross-domain accuracy of 82.0%, outperforming the PointTransformer baseline by 15.8%, and the MSCN+ variant shows additional gains. Fast inference on AV-scale data indicates that the approach is feasible in real time. Overall, MSCN offers a practical path toward domain-generalized perception across diverse sensing conditions and road environments.
Abstract
Point cloud representation has recently become a research hotspot in computer vision and has been adopted for autonomous vehicles. However, adapting deep learning networks to point cloud recognition is challenging because datasets and sensor technologies vary widely. This variability underscores the need for adaptive techniques that maintain accuracy under different conditions. In this paper, we present the Multi-View Structural Convolution Network (MSCN), designed for domain-invariant point cloud recognition. MSCN comprises Structural Convolution Layers (SCL), which extract local-context geometric features from point clouds, and Structural Aggregation Layers (SAL), which extract and aggregate both local and overall context features. Furthermore, MSCN enhances feature robustness by training with unseen-domain point clouds generated from the source domain, enabling the model to acquire domain-invariant representations. Extensive cross-domain experiments demonstrate that MSCN achieves an average accuracy of 82.0%, surpassing the strong baseline PointTransformer by 15.8%, confirming its effectiveness under real-world domain shifts. Our code is available at https://github.com/MLMLab/MSCN.
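To make the idea of local versus overall context features concrete, the following is a minimal, dependency-free sketch. It is not the authors' SCL or SAL; the function names (`knn`, `local_features`, `global_descriptor`), the choice of k-nearest-neighbor offsets, and the three hand-picked statistics are all assumptions made for illustration. It only shows why features built from neighbor offsets are unaffected by where the sensor places the object in space, one simple form of the structural invariance the abstract refers to.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def local_features(points, k=3):
    """Per-point local descriptor built only from neighbor offsets.

    Hypothetical hand-crafted features (not learned, unlike a real network):
    mean neighbor distance, max neighbor distance, and the norm of the
    mean unit direction to the neighbors (a rough asymmetry measure).
    """
    feats = []
    for i, p in enumerate(points):
        nbrs = knn(points, i, k)
        # Offsets are relative to the center point, so a global shift cancels.
        offsets = [[q - c for q, c in zip(points[j], p)] for j in nbrs]
        dists = [math.hypot(*o) for o in offsets]
        mean_dir = [
            sum(o[a] / max(d, 1e-12) for o, d in zip(offsets, dists)) / len(nbrs)
            for a in range(3)
        ]
        feats.append((sum(dists) / len(dists), max(dists), math.hypot(*mean_dir)))
    return feats

def global_descriptor(feats):
    """Aggregate local features over the whole cloud: per-channel max, then mean."""
    cols = list(zip(*feats))
    return [max(c) for c in cols] + [sum(c) / len(c) for c in cols]

# Usage: the eight corners of a unit cube, then the same cube shifted in space.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in cube]
desc_cube = global_descriptor(local_features(cube))
desc_shifted = global_descriptor(local_features(shifted))
```

In this toy example the two descriptors agree because every feature depends only on relative geometry. A learned layer would replace the fixed statistics with trainable weights, but the same max and mean pooling pattern is a common way to combine per-point features into a cloud-level representation.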
