SE3ET: SE(3)-Equivariant Transformer for Low-Overlap Point Cloud Registration

Chien Erh Lin, Minghan Zhu, Maani Ghaffari

TL;DR

SE3ET tackles partial-to-partial point cloud registration under large transformations and low overlap by leveraging SE(3)-equivariant learning through an encoder–decoder based on E2PN and a specialized equivariant transformer. It introduces four transformer designs (ESA, ICA, ACA, RCA) and two configurations (SE3ET-E and SE3ET-I) built around a discretized rotation group defined by $A=6$ octahedral anchors (24 rotations), chosen for efficiency. The approach achieves state-of-the-art robustness to rotation on the indoor 3DMatch/3DLoMatch benchmarks and strong results on the outdoor KITTI benchmark, with favorable run-time and generalization properties. The work advances robust 3D registration by preserving geometric structure through equivariant representations, and the code is open source for reproducibility.

Abstract

Partial point cloud registration is a challenging problem in robotics, especially when the robot undergoes a large transformation, causing a significant initial pose error and a low overlap between measurements. This work proposes exploiting equivariant learning from 3D point clouds to improve registration robustness. We propose SE3ET, an SE(3)-equivariant registration framework that employs equivariant point convolution and equivariant transformer designs to learn expressive and robust geometric features. We tested the proposed registration method on indoor and outdoor benchmarks where the point clouds are under arbitrary transformations and low overlapping ratios. We also provide generalization tests and run-time performance.

Paper Structure (23 sections, 17 equations, 4 figures, 7 tables)


Figures (4)

  • Figure 1: SE3ET can register two low-overlap point clouds with significant rotations and translations. These qualitative results are from rotated 3DLoMatch: the first row is an easy example (28.72% overlap ratio with multiple overlapping surfaces), the middle row is a moderate example (10.51% overlap ratio with multiple overlapping surfaces), and the last row is a challenging example (26.75% overlap ratio with only one overlapping surface).
  • Figure 2: The proposed point cloud registration framework includes an $\mathrm{SE}(3)$-equivariant feature encoder and decoder and an equivariant transformer design for learning the point correspondences of superpoints. The dark blue blocks and arrows are equivariant, and the green blocks and arrows are invariant. We propose two transformer structures, SE3ET-E and SE3ET-I. Here, SA stands for self-attention, and CA stands for cross-attention.
  • Figure 3: The geometric (octahedron shape) and algebraic (color bricks) illustration of using permutations to recover the discretized rotation group. Each color represents the feature of one anchor (vertex), and each ordering of the anchor features corresponds to one discretized rotation defined in the network. If the octahedron is rotated 90 degrees clockwise, the order of the features along the anchor dimension changes accordingly. The discretized rotation group is derived from the permutations of the discrete anchors. For $A = 6$, the rotation group contains 24 rotations.
  • Figure 4: An illustration of the equivariant anchor-based cross-attention (ACA) and rotation-based cross-attention (RCA) modules.
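
The anchor-permutation idea in the Figure 3 caption can be checked with a small, self-contained sketch. It is not the authors' implementation: the anchor ordering, the function names (`octahedral_rotations`, `anchor_permutation`, `rotate_features`), and the convention that the feature at anchor `i` moves to anchor `perm[i]` are all assumptions made for illustration. The sketch enumerates the rotations that map the six octahedron vertices onto themselves and confirms that each one acts as a distinct permutation of the anchors, giving 24 rotations for $A = 6$.

```python
from itertools import permutations, product

# Six anchors: the vertices of a regular octahedron (the +/- unit axes).
# The ordering is an arbitrary choice made for this illustration.
ANCHORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def det3(m):
    """Determinant of a 3x3 matrix given as a tuple of row tuples."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def octahedral_rotations():
    """All 3x3 signed permutation matrices with determinant +1."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = tuple(
                tuple(signs[r] if c == perm[r] else 0 for c in range(3))
                for r in range(3)
            )
            if det3(m) == 1:  # keep proper rotations, drop reflections
                rotations.append(m)
    return rotations


def apply(m, v):
    """Matrix-vector product m @ v for a 3x3 matrix and a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def anchor_permutation(m):
    """Tuple p such that rotating anchor i by m lands on anchor p[i]."""
    return tuple(ANCHORS.index(apply(m, a)) for a in ANCHORS)


def rotate_features(features, perm):
    """Assumed convention: the feature at anchor i moves to anchor perm[i]."""
    out = [None] * len(features)
    for i, f in enumerate(features):
        out[perm[i]] = f
    return out


ROTATIONS = octahedral_rotations()
PERMS = [anchor_permutation(m) for m in ROTATIONS]

# Number of rotations and number of distinct anchor permutations.
print(len(ROTATIONS), len(set(PERMS)))  # prints: 24 24
```

For instance, a 90-degree rotation about the z-axis, `((0, -1, 0), (1, 0, 0), (0, 0, 1))`, yields the anchor permutation `(2, 3, 1, 0, 4, 5)`: the four anchors in the xy-plane cycle while the two on the z-axis stay fixed, so rotating the input reorders features along the anchor dimension rather than changing their values.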