Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration

Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki

TL;DR

The paper addresses the fragility of GMM-based partial-to-partial point set registration, which arises because the reference points of two partial point sets generally do not correspond to each other. It proposes Attention-based Reference Point Shifting (ARPS), which uses Multi-Head Attention (MHA) to locate a common reference point and shifts each point set so that this point lies at the origin, enabling consistent, transformation-invariant feature extraction. The approach yields substantial gains over DeepGMR and UGMMReg on synthetic and outdoor LiDAR benchmarks, with ablations confirming the importance of MHA, centroid supervision, and progressive reference point updating. The method improves practical registration performance on partial data and opens avenues for extension to per-point features and multi-view scenarios.

Abstract

This study investigates how the invariance of feature vectors under translation and rotation of the input point sets affects partial-to-partial point set registration, particularly for techniques based on deep learning and Gaussian mixture models (GMMs). We reveal both theoretical and practical problems associated with such deep-learning-based registration methods using GMMs, with a particular focus on the limitations of DeepGMR, a pioneering study in this line, when applied to partial-to-partial point set registration. Our primary goal is to uncover the causes of these problems and propose a comprehensible solution. To this end, we introduce an attention-based reference point shifting (ARPS) layer, which robustly identifies a common reference point of two partial point sets, thereby acquiring transformation-invariant features. The ARPS layer employs a well-studied attention module to find a common reference point rather than the overlap region. As a result, it significantly enhances the performance of DeepGMR and its recent variant, UGMMReg. Furthermore, these extended models outperform even prior deep learning methods that use attention blocks and Transformers to extract the overlap region or common reference points. We believe these findings provide deeper insights into registration methods using deep learning and GMMs.
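
The role of a common reference point can be made concrete with a short derivation. This is a sketch under assumed notation, not taken from the paper: $x$ and $y$ denote corresponding source and target points, $(R, t)$ the unknown rigid transformation, and $c_X$, $c_Y$ the reference points of the two sets.

$$
y = R x + t, \qquad c_Y = R\, c_X + t
\quad\Longrightarrow\quad
y - c_Y = R\,(x - c_X), \qquad \lVert y - c_Y \rVert = \lVert x - c_X \rVert .
$$

When the two reference points correspond to each other, shifting each set to its own reference point cancels the translation, and rotation-invariant quantities such as distances to the reference point agree between the sets. For partial point sets, the centroids $\bar{x}$ and $\bar{y}$ are averages over different subsets of the shape, so in general $\bar{y} \neq R\,\bar{x} + t$ and the relation above no longer holds. This is why the reference point has to be estimated rather than taken as the centroid of each input.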

Paper Structure

This paper contains 20 sections, 13 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Attention-based reference point shifting (ARPS), proposed in this study, shifts the point sets so that the estimated reference point coincides with the origin, and encodes point positions to obtain point features invariant to rotation and translation. The DeepGMR method enhanced by ARPS (DGRM-ARPS) registers point sets more accurately than the traditional method (JRMPC) and the deep-learning-based method (DeepGMR), both of which use Gaussian mixture models. In this figure, source and target point sets are represented by red and blue points, and their centroids by large balls of the respective colors, while the green ball represents the origin. ARPS moves these centroids toward the origin.
  • Figure 2: We categorize input point sets based on the point arrangement after registration. Duplicated point sets can be aligned perfectly such that all the source points coincide with the target points after the registration. Unduplicated point sets both represent the global shape of an object but do not have the same point configuration, and their points are not aligned perfectly by registration. Partial point sets represent different parts of the object's geometry, and only parts of their shapes overlap even after registration.
  • Figure 3: Detailed construction of the ARPS layer. One of the inputs for the cross-attention blocks is obtained from the other point set. The features with a $\star$ correspond to each other.
  • Figure 4: Overall network architecture for GMM-based point set registration using ARPS operations. The network consists of several ARPS layers and a Gaussian mixture registration (GMR) module. In each ARPS layer, a new reference point of each point set is estimated with the features enhanced by attention blocks. Then, the point set is shifted such that the estimated reference point coincides with the origin. Details of ARPS layers are given in Figure 3.
  • Figure 5: Benchmarking on ModelNet20 with Gaussian noise. GMM-based methods are marked with an asterisk ${}^*$.
  • ...and 4 more figures
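
The estimate-then-shift loop described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: `estimate_reference` is a hypothetical stand-in for the learned attention-based estimator (here a softmax-weighted mean whose weights depend only on distances to the current origin), and starting from the centroid is likewise an assumption made for the sketch.

```python
import numpy as np


def estimate_reference(points, temperature=1.0):
    """Hypothetical stand-in for the learned, attention-based estimator.

    Returns a softmax-weighted mean of the points. The weights depend only
    on distances to the current origin, so they are unchanged by rotations.
    """
    sq_norms = np.sum(points ** 2, axis=1)
    logits = -sq_norms / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()
    return weights @ points  # shape (3,)


def progressive_shift(points, num_layers=3, temperature=1.0):
    """Repeatedly estimate a reference point and move it to the origin."""
    total_shift = points.mean(axis=0)  # assumption: start from the centroid
    shifted = points - total_shift
    for _ in range(num_layers):
        ref = estimate_reference(shifted, temperature)
        shifted = shifted - ref            # reference point now at the origin
        total_shift = total_shift + ref    # accumulate in the input frame
    return shifted, total_shift


# Toy data: the target is a rigidly transformed copy of the source.
rng = np.random.default_rng(0)
source = rng.normal(size=(256, 3))
theta = np.pi / 3
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
translation = np.array([1.0, -2.0, 0.5])
target = source @ rotation.T + translation

src_shifted, src_ref = progressive_shift(source)
tgt_shifted, tgt_ref = progressive_shift(target)
```

In this toy setting both inputs contain the same points, so the estimated reference points correspond and the shifted sets differ by the rotation alone. With two different partial views, the stand-in estimator gives no such guarantee; closing that gap is the job the captions assign to the attention blocks.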