SPEAL: Skeletal Prior Embedded Attention Learning for Cross-Source Point Cloud Registration

Kezheng Xiong; Maoji Zheng; Qingshan Xu; Chenglu Wen; Siqi Shen; Cheng Wang

TL;DR

SPEAL addresses cross-source point cloud registration under noise, outliers, and variations in density and scale by embedding skeletal priors into a transformer-based pipeline. It introduces three components: the Skeleton Extraction Module (SEM), the Skeleton-Aware GeoTransformer (SAGTR), and the Correspondence Dual-Sampler (CDS). Together they produce skeleton-informed coarse correspondences and support robust refinement. On the KITTI CrossSource and GermanyForest3D benchmarks, SPEAL achieves state-of-the-art cross-source performance and competitive same-source results, and ablations confirm the essential roles of skeletal priors and spectral denoising. The approach improves robustness in unstructured, real-world 3D scenes and opens a new direction for skeletal-prior-guided registration.

Abstract

Point cloud registration, a fundamental task in 3D computer vision, remains largely unexplored for cross-source point clouds and unstructured scenes. The primary challenges arise from noise, outliers, and variations in scale and density. Current methods, however, neglect the geometric nature of point clouds, which restricts their performance. In this paper, we propose a novel method termed SPEAL that leverages skeletal representations to learn the intrinsic topologies of point clouds effectively, facilitating robust capture of geometric intricacy. Specifically, we design the Skeleton Extraction Module to extract skeleton points and skeletal features in an unsupervised manner, which is inherently robust to noise and density variations. Then, we propose the Skeleton-Aware GeoTransformer to encode high-level skeleton-aware features. It explicitly captures the topological nature of each point cloud and the inter-point-cloud skeletal correlations using the noise-robust and density-invariant skeletal representations. Next, we introduce the Correspondence Dual-Sampler, which augments the correspondence set with skeletal correspondences. Furthermore, we construct a novel, challenging, large-scale cross-source point cloud dataset named KITTI CrossSource for benchmarking cross-source point cloud registration methods. Extensive quantitative and qualitative experiments demonstrate our approach's superiority and robustness on both cross-source and same-source datasets. To the best of our knowledge, our approach is the first to facilitate point cloud registration with skeletal geometric priors.
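
The abstract describes the Correspondence Dual-Sampler only at a high level, as augmenting the correspondence set with skeletal correspondences. The sketch below illustrates that general idea and nothing more: it is not the paper's CDS. It assumes two hypothetical score matrices of the same shape (one from ordinary features, one from skeleton-aware features), takes the top-scoring pairs from each, and merges them without duplicates. The function names `top_k_pairs` and `merge_correspondences` are made up for illustration.

```python
import numpy as np

def top_k_pairs(scores, k):
    """Return the k highest-scoring (row, col) index pairs of a score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = min(k, scores.size)
    # Sort all entries in descending order and keep the first k flat indices.
    flat = np.argsort(scores, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, scores.shape)
    return list(zip(rows.tolist(), cols.tolist()))

def merge_correspondences(primary, extra):
    """Append pairs from `extra` to `primary`, skipping duplicates and keeping order."""
    seen = set()
    merged = []
    for pair in list(primary) + list(extra):
        if pair not in seen:
            seen.add(pair)
            merged.append(pair)
    return merged

# Hypothetical similarity scores between 2 source and 2 target superpoints.
feature_scores = [[0.9, 0.1], [0.2, 0.8]]
skeleton_scores = [[0.1, 0.7], [0.95, 0.3]]  # assumed skeleton-aware scores

coarse = top_k_pairs(feature_scores, 2)
skeletal = top_k_pairs(skeleton_scores, 1)
hybrid = merge_correspondences(coarse, skeletal)
```

Keeping the two samplers separate means the skeletal pairs can only add candidates to the coarse set and never displace an existing pair.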

Paper Structure (14 sections, 11 equations, 7 figures, 4 tables)

Introduction
Related Work
Method
Skeleton Extraction Module
Skeleton-Aware GeoTransformer
Correspondence Dual-Sampler
Losses
Experiments
Datasets and Experimental Setup
Cross-Source Results
Same-Source Results
Analysis
Ablation Studies
Conclusion

Figures (7)

Figure 1: Impossible Triangle of Current Methods. Registration recall is shown under three settings: KITTI Odometry (same-source), KITTI CrossSource, and the low-overlap test split of KITTI CrossSource. Existing methods fail to match SPEAL across all three challenging settings.
Figure 2: The Overall Pipeline of SPEAL. The backbone extracts superpoints and multi-level features from $\mathcal{P}$ and $\mathcal{Q}$. Then, SEM extracts skeletal representations and SAGTR learns discriminative skeleton-aware features. Finally, CDS extracts hybrid coarse correspondences with skeletal priors. The final transformation is computed with LGR.
Figure 3: The structure (left) and computational graph (right) of skeleton-aware geometric self-attention.
Figure 4: The computation of skeleton-aware structure embedding.
Figure 5: The structure (left) and computational graph (right) of skeleton-aware cross-attention.
...and 2 more figures
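
Figure 1 compares methods by registration recall. As a minimal sketch of how such a recall is commonly computed, the code below counts a pair as registered when both its relative rotation error (RRE) and relative translation error (RTE) fall below a threshold. The thresholds of 5 degrees and 2 metres are the values often used on KITTI Odometry and are an assumption here; this page does not state the thresholds the paper uses. The function names are illustrative.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    cos = (np.trace(np.asarray(R_gt).T @ np.asarray(R_est)) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground-truth translations."""
    return float(np.linalg.norm(np.asarray(t_est, dtype=float) - np.asarray(t_gt, dtype=float)))

def registration_recall(estimates, ground_truths, rre_thresh_deg=5.0, rte_thresh=2.0):
    """Fraction of pairs whose RRE and RTE are both below the (assumed) thresholds."""
    if not estimates:
        return 0.0
    successes = 0
    for (R_est, t_est), (R_gt, t_gt) in zip(estimates, ground_truths):
        ok_rot = rotation_error_deg(R_est, R_gt) < rre_thresh_deg
        ok_trans = translation_error(t_est, t_gt) < rte_thresh
        successes += int(ok_rot and ok_trans)
    return successes / len(estimates)

# Two toy pairs: one perfect estimate, one rotated by 90 degrees about z.
I = np.eye(3)
Rz90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
estimates = [(I, np.zeros(3)), (Rz90, np.zeros(3))]
ground_truths = [(I, np.zeros(3)), (I, np.zeros(3))]
recall = registration_recall(estimates, ground_truths)
```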
