IGASA: Integrated Geometry-Aware and Skip-Attention Modules for Enhanced Point Cloud Registration

Dongxu Zhang, Jihua Zhu, Shiqi Li, Wenbiao Yan, Haoran Xu, Peilin Fan, Huimin Lu

Abstract

Point cloud registration (PCR) is a fundamental task in 3D vision that supports applications such as autonomous driving, robotics, and environmental modeling. Despite its widespread use, existing methods often fail under real-world conditions such as heavy noise, significant occlusion, and large-scale transformations, which degrades registration accuracy and robustness in complex environments. In this paper, we propose IGASA, a registration framework built on a Hierarchical Pyramid Architecture (HPA) designed for robust multi-scale feature extraction and fusion. The framework integrates two key components: the Hierarchical Cross-Layer Attention (HCLA) module and the Iterative Geometry-Aware Refinement (IGAR) module. The HCLA module uses skip attention mechanisms to align multi-resolution features and enhance local geometric consistency. The IGAR module handles the fine matching phase, leveraging reliable correspondences established during coarse matching. Together, these modules allow IGASA to adapt to diverse point cloud structures and intricate transformations. We evaluate IGASA on four widely used benchmark datasets: 3DMatch, 3DLoMatch, KITTI, and nuScenes. Our experiments consistently show that IGASA outperforms state-of-the-art methods in registration accuracy. This work provides a foundation for advancing point cloud registration techniques and offers insights for practical 3D vision applications. The code for IGASA is available at https://github.com/DongXu-Zhang/IGASA.

Paper Structure

This paper contains 25 sections, 19 equations, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: The Inlier Ratio (IR) is plotted on the x-axis for 3DMatch and on the y-axis for 3DLoMatch. IGASA stands out by consistently achieving the highest IR.
  • Figure 2: Overview of the proposed IGASA framework. The pipeline operates in three stages. (a) The HPA module processes the source $P$ and target $Q$ inputs to construct a hierarchical feature pyramid ($F_{ordinary}, F_{minor}, F_{primary}$) with progressively expanding receptive fields. (b) The HCLA module performs coarse matching by using the SGIRA to fuse global semantics with local geometry, followed by the SAIGA for feature refinement. (c) The IGAR module executes the fine registration phase, where an iterative optimization loop (repeated $N$ times) dynamically updates correspondence weights to suppress outliers and estimate the final transformation $\{R, t\}$.
  • Figure 3: Details of Gated Fusion Mechanism. $F_p^n$ and $F_q^n$ are preprocessed, passed through parallel convolutional layers, combined to form $F_t$, and then refined by a residual adjustment to produce $G_{\text{fusion}}$ for weighted fusion.
  • Figure 4: Results of IGASA on 3D(Lo)Match for several scenes from the dataset. The first row shows the two point clouds in their raw, unaligned positions. The second row shows the registration result predicted by our model.
  • Figure 5: Qualitative registration results on the KITTI odometry benchmark. We visualize four representative driving scenes characterized by large spatial extents and varying structural distinctiveness. The first row displays the initial unaligned input pairs (Source: Blue, Target: Yellow), illustrating significant initial pose discrepancies. The second row shows the registration results achieved by IGASA. The red rectangular boxes highlight specific regions with sharp structural boundaries, such as road curbs and intersection corners. Note that despite the limited overlap and noise typical of outdoor LiDAR scans, our method precisely aligns fine-grained structures such as trees, vehicles, and road boundaries.
  • ...and 1 more figure
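
The Figure 2(c) caption describes an iterative loop that reweights correspondences to suppress outliers before estimating $\{R, t\}$, and Figure 1 reports the Inlier Ratio. This summary does not give the actual update rule, so the following is only a minimal NumPy sketch of a generic stand-in, not the authors' IGAR module: a weighted Kabsch (SVD) fit alternated with Gaussian residual reweighting. The function names `weighted_kabsch`, `iterative_reweighted_fit`, and `inlier_ratio`, the Gaussian weighting, and the parameters `sigma` and `tau` are all assumptions made for illustration.

```python
import numpy as np


def weighted_kabsch(src, dst, w):
    """Closed-form rigid fit minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = w / w.sum()
    src_c = (w[:, None] * src).sum(axis=0)  # weighted centroids
    dst_c = (w[:, None] * dst).sum(axis=0)
    # Weighted 3x3 cross-covariance between the centered point sets.
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def iterative_reweighted_fit(src, dst, n_iters=5, sigma=0.1):
    """Alternate a weighted fit with residual-based reweighting.

    Assumption: a Gaussian kernel on the residual is used as the weight.
    This is a common robust-estimation choice, not the rule from the paper.
    """
    w = np.ones(len(src))
    for _ in range(n_iters):
        R, t = weighted_kabsch(src, dst, w)
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        # Large residuals get weights near zero; the floor keeps w.sum() > 0.
        w = np.maximum(np.exp(-(residuals / sigma) ** 2), 1e-12)
    return R, t, w


def inlier_ratio(src, dst, R, t, tau=0.1):
    """Fraction of correspondences whose residual under (R, t) is below tau."""
    residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
    return float((residuals < tau).mean())
```

Here `src` and `dst` are `(N, 3)` arrays of putative correspondences, for example those produced by a coarse matching stage. In this sketch the final weights `w` double as a soft inlier mask, and `inlier_ratio` can be evaluated with either the estimated or a ground-truth transformation, depending on which convention is wanted.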