SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration
Kezheng Xiong, Maoji Zheng, Qingshan Xu, Chenglu Wen, Siqi Shen, Cheng Wang
TL;DR
SPEAL addresses cross-source point cloud registration under noise, outliers, and density and scale variations by embedding skeletal priors into a transformer-based pipeline. It introduces a Skeleton Extraction Module (SEM), a Skeleton-Aware GeoTransformer (SAGTR), and a Correspondence Dual-Sampler (CDS), which together enable skeleton-informed coarse correspondences and robust refinement. On the KITTI CrossSource and GermanyForest3D benchmarks, SPEAL achieves state-of-the-art cross-source performance and competitive same-source results, with ablations confirming the essential roles of skeletal priors and spectral denoising. The approach improves robustness in unstructured, real-world 3D scenes and opens a new direction for registration guided by skeletal priors.
Abstract
Point cloud registration, a fundamental task in 3D computer vision, remains largely unexplored for cross-source point clouds and unstructured scenes. The primary challenges arise from noise, outliers, and variations in scale and density. Current methods, however, neglect the geometric nature of point clouds, which restricts their performance. In this paper, we propose a novel method termed SPEAL that leverages skeletal representations to learn the intrinsic topology of point clouds effectively, facilitating robust capture of geometric intricacy. Specifically, we design the Skeleton Extraction Module to extract skeleton points and skeletal features in an unsupervised manner that is inherently robust to noise and density variations. Then, we propose the Skeleton-Aware GeoTransformer to encode high-level skeleton-aware features. It explicitly captures the topological nature of each point cloud and the inter-point-cloud skeletal correlations using the noise-robust and density-invariant skeletal representations. Next, we introduce the Correspondence Dual-Sampler, which augments the correspondence set with skeletal correspondences. Furthermore, we construct a challenging, novel, large-scale cross-source point cloud dataset named KITTI CrossSource for benchmarking cross-source point cloud registration methods. Extensive quantitative and qualitative experiments demonstrate the superiority and robustness of our approach on both cross-source and same-source datasets. To the best of our knowledge, our approach is the first to facilitate point cloud registration with skeletal geometric priors.
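To make the idea of augmenting a correspondence set with skeletal correspondences more concrete, the following is a minimal sketch rather than the SPEAL implementation. The function `dual_sample`, its parameters `k_top` and `k_skel`, and the boolean mask `is_skeletal` are hypothetical names introduced only for illustration. The rigid transform is then recovered with the standard Kabsch (SVD) least-squares solver, which is assumed here as a generic final step and is not described in the abstract.

```python
import numpy as np

def dual_sample(corr, scores, is_skeletal, k_top, k_skel):
    """Hypothetical sampler (illustration only, not the paper's CDS).

    Keeps the k_top highest-scoring correspondences, then adds up to
    k_skel further correspondences that are flagged as skeletal.
    corr: (N, 2) integer index pairs; scores: (N,); is_skeletal: (N,) bool.
    """
    order = np.argsort(-scores)          # indices sorted by descending score
    top = order[:k_top]
    rest = order[k_top:]
    skel = rest[is_skeletal[rest]][:k_skel]
    return corr[np.concatenate([top, skel])]

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)  # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Given matched point sets `src` and `dst`, one would call `kept = dual_sample(corr, scores, is_skeletal, k_top, k_skel)` and then `kabsch(src[kept[:, 0]], dst[kept[:, 1]])`. In practice a robust estimator would be needed on top of this, since real correspondence sets contain outliers.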
