Equi-GSPR: Equivariant SE(3) Graph Network Model for Sparse Point Cloud Registration

Xueyang Kang, Zhaoliang Luan, Kourosh Khoshelham, Bing Wang

TL;DR

This work tackles robust sparse point cloud registration by introducing an $SE(3)$-equivariant graph network that jointly learns local geometric descriptors, propagates SE(3) equivariant features via graph neural networks, and employs a Low-Rank Feature Transformation to produce compact, robust descriptors for similarity-based pose estimation. Key contributions include the LRFT module, a local $SO(3)$-invariant frame projection for equivariant message passing, and a rank-based regularizer with submatrix checks to suppress outliers without requiring explicit point correspondences. The method achieves state-of-the-art performance on indoor 3DMatch and strong recall on outdoor KITTI while maintaining low latency, demonstrating data efficiency and potential for real-time registration. Overall, the paper advances registration by integrating symmetry-aware representations with efficient matching, enabling robust alignment from sparsely sampled points and opening avenues for permutation-invariant extensions and multi-sensor fusion.

Abstract

Point cloud registration is a foundational task for 3D alignment and reconstruction applications. While both traditional and learning-based registration approaches have succeeded, leveraging the intrinsic symmetry of point cloud data, including rotation equivariance, has received insufficient attention. Ignoring this symmetry hinders effective learning, so models require more training data and greater complexity. To address these challenges, we propose a graph neural network model that embeds a local Special Euclidean 3D, SE(3), equivariance property through SE(3) message-passing-based propagation. Our model consists mainly of a descriptor module, equivariant graph layers, a match similarity block, and final regression layers. This modular design lets us use sparsely sampled input points and easily initialize the descriptor with self-trained or pre-trained geometric feature descriptors. Experiments on the 3DMatch and KITTI datasets show compelling and robust performance compared to state-of-the-art approaches, while model complexity remains relatively low.
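
The SE(3) equivariance mentioned in the abstract can be checked numerically: if the input points are rotated and translated, the updated coordinates should move by the same rigid transform while the scalar features stay unchanged. The sketch below is not the paper's layer. It assumes a generic EGNN-style update on a fully connected graph, and the names `equivariant_layer`, `random_rotation`, `w_edge`, and `w_coord` are hypothetical, chosen only for illustration.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix (orthogonal, det = +1) via QR."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs; q stays orthogonal
    if np.linalg.det(q) < 0:         # reflect one axis to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q

def equivariant_layer(x, h, w_edge, w_coord):
    """One EGNN-style message-passing step (an assumed stand-in layer).

    x: (N, 3) point coordinates, h: (N, F) rotation-invariant features,
    w_edge: (2F + 1, F) and w_coord: (F, 1) are illustrative weights.
    """
    n, f = h.shape
    diff = x[:, None, :] - x[None, :, :]                  # (N, N, 3) relative positions
    dist2 = np.sum(diff ** 2, axis=-1, keepdims=True)     # (N, N, 1), invariant to SE(3)
    h_i = np.broadcast_to(h[:, None, :], (n, n, f))
    h_j = np.broadcast_to(h[None, :, :], (n, n, f))
    msg = np.tanh(np.concatenate([h_i, h_j, dist2], axis=-1) @ w_edge)  # (N, N, F)
    mask = 1.0 - np.eye(n)[:, :, None]                    # drop self-messages
    # Coordinates move along relative vectors scaled by invariant weights,
    # so a rigid transform of the input carries through to the output.
    x_new = x + np.sum(mask * diff * (msg @ w_coord), axis=1) / (n - 1)
    h_new = h + np.sum(mask * msg, axis=1)                # invariant feature update
    return x_new, h_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, h = rng.normal(size=(8, 3)), rng.normal(size=(8, 4))
    w_edge, w_coord = rng.normal(size=(9, 4)), rng.normal(size=(4, 1))
    rot, t = random_rotation(rng), rng.normal(size=3)
    x1, h1 = equivariant_layer(x, h, w_edge, w_coord)
    x2, h2 = equivariant_layer(x @ rot.T + t, h, w_edge, w_coord)
    print("coordinates equivariant:", np.allclose(x2, x1 @ rot.T + t, atol=1e-6))
    print("features invariant:", np.allclose(h2, h1, atol=1e-6))
```

The design point is that messages depend only on squared distances and invariant features, so the only place coordinates enter the update is through relative vectors, which rotate with the input.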

Paper Structure (13 sections, 10 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: The registration model converts the sparse point descriptors of the source and target frames into an equivariant graph feature representation, respectively. Then the $\mathbf{SE}(3)$ equivariant graph features are used for the similarity score calculation. The matched features are then decoded into the relative transform to align the two scans.
  • Figure 2: The registration model consists of an encoder, a feature match block, and a decoder. Pointwise feature descriptors are extracted from the source and target scan points, passed through equivariant graph layers, and combined with coordinate embeddings to form a feature matrix in row-major order. Next, the feature matrices from the source and target frames are compressed using an MLP-based Low-Rank Feature Transformation (LRFT). The aggregated features are used to create a similarity map from the dot product of the feature descriptors. In the decoder, features are weighted by similarity scores, concatenated, and processed through pooling and fully connected layers to predict the relative translation $t_j^i$ and quaternion $q_j^i$.
  • Figure 3: (a) Reducing the number of features through a low-rank transformation implemented with MLP layers. (b) Verifying the rank of the similarity score matrix using submatrices at the bottom right ($5\times 5$) and center ($7\times 7$) of the yellow dashed region.
  • Figure 4: Registration results of the proposed model on 3DMatch (Zeng et al., 2017) and KITTI (Geiger et al., 2012). Points from the target frame are shown in blue; points transformed from the source frame into the target frame by the predicted transform are shown in yellow.
  • Figure 5: The t-SNE comparisons of equi-features outputs.
  • ...and 1 more figures
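
The similarity map of Figure 2 and the submatrix rank check of Figure 3(b) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the L2 normalisation of descriptors, the tolerance `tol`, the placement of the two windows relative to the full matrix (the caption places them within a yellow dashed region), and the function names `similarity_map` and `submatrix_ranks` are all assumptions. Only the dot-product similarity and the $5\times 5$ bottom-right and $7\times 7$ center windows come from the captions.

```python
import numpy as np

def similarity_map(f_src, f_tgt):
    """Dot-product similarity between two (K, D) descriptor matrices.

    Rows are L2-normalised first (an assumption), so entries are cosines.
    """
    f_src = f_src / np.linalg.norm(f_src, axis=1, keepdims=True)
    f_tgt = f_tgt / np.linalg.norm(f_tgt, axis=1, keepdims=True)
    return f_src @ f_tgt.T                      # (K, K) similarity scores

def submatrix_ranks(sim, tol=1e-6):
    """Numerical rank of the bottom-right 5x5 and central 7x7 blocks of `sim`."""
    k = sim.shape[0]
    corner = sim[k - 5:, k - 5:]                # bottom-right 5x5 block
    start = (k - 7) // 2
    center = sim[start:start + 7, start:start + 7]  # central 7x7 block
    return (int(np.linalg.matrix_rank(corner, tol=tol)),
            int(np.linalg.matrix_rank(center, tol=tol)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f_src = rng.normal(size=(16, 32))           # 16 sparse points, 32-dim descriptors
    f_tgt = rng.normal(size=(16, 32))
    print("distinct descriptors:", submatrix_ranks(similarity_map(f_src, f_tgt)))
    # Collapsed descriptors (every source point looks alike) make the blocks rank-deficient.
    f_bad = np.tile(f_src[:1], (16, 1))
    print("collapsed descriptors:", submatrix_ranks(similarity_map(f_bad, f_tgt)))
```

A rank-deficient block indicates that several descriptors are nearly indistinguishable, which is the kind of degenerate matching a rank-based regularizer would penalise.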