SGNet: Salient Geometric Network for Point Cloud Registration

Qianliang Wu, Yaqing Ding, Lei Luo, Haobo Jiang, Shuo Gu, Chuanwei Zhou, Jin Xie, Jian Yang

TL;DR

This paper introduces a semantic-aware geometric encoder that combines object-level and patch-level semantic information, together with a transformer that encodes high-order geometric features. These features help identify salient points within initial overlap regions while accounting for global high-order geometric consistency.

Abstract

Point Cloud Registration (PCR) is a critical and challenging task in computer vision. One of the primary difficulties in PCR is identifying salient and meaningful points that exhibit consistent semantic and geometric properties across different scans. Previous methods have encountered challenges with ambiguous matching due to the similarity among patch blocks throughout the entire point cloud and the lack of consideration for efficient global geometric consistency. To address these issues, we propose a new framework that includes several novel techniques. Firstly, we introduce a semantic-aware geometric encoder that combines object-level and patch-level semantic information. This encoder significantly improves registration recall by reducing ambiguity in patch-level superpoint matching. Additionally, we incorporate a prior knowledge approach that utilizes an intrinsic shape signature to identify salient points. This enables us to extract the most salient super points and meaningful dense points in the scene. Secondly, we introduce an innovative transformer that encodes High-Order (HO) geometric features. These features are crucial for identifying salient points within initial overlap regions while considering global high-order geometric consistency. To optimize this high-order transformer further, we introduce an anchor node selection strategy. By encoding inter-frame triangle or polyhedron consistency features based on these anchor nodes, we can effectively learn high-order geometric features of salient super points. These high-order features are then propagated to dense points and utilized by a Sinkhorn matching module to identify key correspondences for successful registration. In our experiments conducted on well-known datasets such as 3DMatch/3DLoMatch and KITTI, our approach has shown promising results, highlighting the effectiveness of our novel method.
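The abstract names a Sinkhorn matching module but does not spell it out here. As a rough illustration only, the following is a minimal log-domain Sinkhorn normalization in NumPy. It is not the authors' implementation: the function name `sinkhorn_log`, the iteration count, and the random score matrix are assumptions made for this sketch, and the slack row and column that matching pipelines often add for unmatched points are omitted.

```python
import numpy as np


def _logsumexp(x, axis):
    # Numerically stable log(sum(exp(x))) along one axis, keeping dimensions.
    m = np.max(x, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))


def sinkhorn_log(scores, n_iters=100):
    """Alternately normalize rows and columns of exp(scores) in the log domain.

    `scores` is a square similarity matrix (hypothetical input for this sketch).
    The result approaches a doubly stochastic soft-assignment matrix.
    """
    log_p = np.asarray(scores, dtype=np.float64)
    for _ in range(n_iters):
        log_p = log_p - _logsumexp(log_p, axis=1)  # make each row sum to 1
        log_p = log_p - _logsumexp(log_p, axis=0)  # make each column sum to 1
    return np.exp(log_p)


# Toy usage: bounded random scores standing in for feature similarities.
rng = np.random.default_rng(0)
scores = rng.uniform(-1.0, 1.0, size=(5, 5))
assignment = sinkhorn_log(scores)
```

Working in the log domain avoids overflow when scores are large. Correspondences would then typically be read off from the largest entries of `assignment`, which is a separate step not shown here.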

Paper Structure (26 sections, 10 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: Our work is inspired by a challenging scenario where successful registration critically depends on the prominent and essential points found on the table, as illustrated in the leftmost region of (a). However, these points are spatially far apart from the abundant points situated on the wall (i.e., points in the rightmost region of (a)). To tackle this challenge, we introduce a new high-order (HO) geometric transformer that leverages high-order geometric features to effectively address such scenarios.
  • Figure 2: Our proposed framework comprises several components. First, the point clouds ${P}$ and ${Q}$ are fed to the down-sampling layers to obtain the super points ($\hat{{P}}$, $\hat{{Q}}$) and their features ($\mathbf{F}^{\hat{{P}}}$, $\mathbf{F}^{\hat{{Q}}}$). We then use the semantic-aware geometric encoder to detect overlap regions. Next, our newly designed high-order geometric transformer is applied to encode the high-order inter-frame geometric salient features ($\mathbf{F}^{\hat{{P}}}_o$, $\mathbf{F}^{\hat{{Q}}}_o$). We combine ($\mathbf{F}^{\hat{{P}}}$, $\mathbf{F}^{\hat{{Q}}}$) with ($\mathbf{F}^{\hat{{P}}}_o$, $\mathbf{F}^{\hat{{Q}}}_o$) and propagate these combined features up to the dense-level points in the overlap region. Finally, the local-to-global registration step computes the 6D rigid transformation $\{\mathbf{R},\mathbf{t}\}$. Zoom in for details.
  • Figure 3: Anchor points selection. The candidate overlap region consists of green and blue points corresponding to each other (indicated by the black dashed curve representing initial candidate correspondences). From this set of points, a subset of anchor points (highlighted within the pinkish-red dashed box) is selected based on their maximum distance from ${\hat{p}}_i$ and ${\hat{q}}_{i^{'}}$. In an example scenario where $K=3$, two anchor points generate up to four triangles for ${\hat{p}}_i$ and ${\hat{p}}_j$. Consequently, the angles in these four triangles can be pooled together to aggregate the embeddings for ${\hat{p}}_i$ and ${\hat{q}}_{i^{'}}$.
  • Figure 4: The visualizations showcase the patch correspondences achieved by our model on the 3DMatch dataset. The registration results in the top row were generated using GeoTR (qin2022geometric) + LGR, while the results in the bottom row were generated using GeoTR (qin2022geometric) + High-Order Transformer + LGR. The inlier/outlier correspondences are highlighted in green/red. These visualizations demonstrate the effectiveness of our high-order transformer in eliminating geometrically inconsistent matches. Please zoom in for more detailed observations.
  • Figure 5: This visualization shows that the high-level semantic encoder captures the semantic information of the wall. HO, AUG, and SE denote the high-order transformer, superpoint augmentation, and high-level semantic encoder. The inlier/outlier correspondences are highlighted in green/red. Please zoom in for more detailed observations.
  • ...and 1 more figure
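The anchor selection and triangle angles described in the Figure 3 caption can be sketched roughly as follows. This is an illustrative reading of the caption, not the paper's code. The helper names (`farthest_anchors`, `angle_at`, `triangle_angle_gap`) are hypothetical; the sketch assumes that anchors are chosen by distance from $\hat{p}_i$ only, that only the angle at vertex $i$ is compared, and that max pooling stands in for whatever aggregation the transformer actually learns.

```python
import numpy as np


def farthest_anchors(points, i, k):
    """Indices of the k candidate points farthest from points[i] (i excluded)."""
    d = np.linalg.norm(points - points[i], axis=1)
    d[i] = -np.inf  # never pick the query point itself
    return np.argsort(-d)[:k]


def angle_at(vertex, a, b):
    """Angle in radians at `vertex` in the triangle (vertex, a, b)."""
    u, v = a - vertex, b - vertex
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))


def triangle_angle_gap(p, q, i, j, k=2):
    """Largest angle mismatch over anchor triangles (i, j, anchor).

    p[n] and q[n] are assumed to be the n-th candidate correspondence.
    A rigid motion preserves angles, so consistent matches give a gap near 0.
    """
    anchors = [a for a in farthest_anchors(p, i, k + 1) if a != j][:k]
    gaps = [abs(angle_at(p[i], p[j], p[a]) - angle_at(q[i], q[j], q[a]))
            for a in anchors]
    return max(gaps)


# Toy usage: q is a rigidly moved copy of p (rotation about z plus a shift).
p = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0], [1.0, 1.0, 1.0]])
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t = np.array([0.5, -1.0, 2.0])
q = p @ R.T + t
gap_ok = triangle_angle_gap(p, q, i=0, j=1)

# Corrupt one correspondence: q[1] now matches the wrong source location.
q_bad = q.copy()
q_bad[1] = R @ np.array([1.0, 1.0, 0.0]) + t
gap_bad = triangle_angle_gap(p, q_bad, i=0, j=1)
```

In this toy case the consistent pair yields a gap close to zero, while the corrupted correspondence changes one of the two anchor angles from 90 to 45 degrees, which is the kind of geometric inconsistency the high-order features are meant to expose.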