SE(3)-bi-equivariant Transformers for Point Cloud Assembly

Ziming Wang; Rebecka Jörnsten

TL;DR

This work proposes the SE(3)-bi-equivariant transformer (BITR), a point cloud assembly method built on the SE(3)-bi-equivariance prior of the task: when the inputs are rigidly perturbed, the output is guaranteed to transform accordingly.

Abstract

Given a pair of point clouds (PCs), the goal of assembly is to recover a rigid transformation that aligns one point cloud to the other. This task is challenging because the point clouds may be non-overlapping and may have arbitrary initial positions. To address these difficulties, we propose a method, called SE(3)-bi-equivariant transformer (BITR), based on the SE(3)-bi-equivariance prior of the task: it guarantees that when the inputs are rigidly perturbed, the output will transform accordingly. Due to its equivariance property, BITR can not only handle non-overlapping PCs, but also guarantee robustness against initial positions. Specifically, BITR first extracts features of the inputs using a novel $SE(3) \times SE(3)$-transformer, and then projects the learned features onto the group $SE(3)$ as the output. Moreover, we theoretically show that swap and scale equivariances can be incorporated into BITR, which further guarantees stable performance when the inputs are scaled or swapped. We experimentally show the effectiveness of BITR in practical tasks.
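
The three equivariance properties named in the abstract can be written out explicitly. The following is a sketch under assumed notation that does not appear in the abstract itself: $f$ is the assembly map taking a source PC $X$ and a reference PC $Y$ to a rigid transformation $f(X, Y) = (r, t) \in SE(3)$, $g_1, g_2 \in SE(3)$ are arbitrary rigid perturbations, and $c > 0$ is a scale factor.

$$
\begin{aligned}
\text{bi-equivariance:} \quad & f(g_1 X,\, g_2 Y) = g_2\, f(X, Y)\, g_1^{-1}, \\
\text{swap-equivariance:} \quad & f(Y, X) = f(X, Y)^{-1}, \\
\text{scale-equivariance:} \quad & f(cX,\, cY) = (r,\, c\,t).
\end{aligned}
$$

Read this way, the first line says that moving the source by $g_1$ and the reference by $g_2$ changes the answer in exactly the way needed to keep the assembled result consistent: undo $g_1$, apply the original alignment, then follow the reference with $g_2$.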

Paper Structure (46 sections, 15 theorems, 105 equations, 13 figures, 6 tables)

Introduction
Related works
Preliminaries
Group representation and equivariance
Arun's method
$SE(3)$-bi-equivariant transformer
Problem formulation
$SE(3)\times SE(3)$-transformer
Point cloud merge
SE(3)-projection
Swap-equivariance and scale-equivariance
Incorporating swap-equivariance
Incorporating scale-equivariance
Experiments and analysis
Experiment settings
...and 31 more sections

Key Result

Proposition 3.2

Under a mild assumption, Arun's algorithm is $SE(3)$-bi-equivariant, swap-equivariant, and scale-equivariant.
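
The proposition can be checked numerically on a small example. The sketch below is not the paper's code. It assumes the standard SVD-based form of Arun's least-squares rigid fit with known point correspondences, and it assumes a non-degenerate cross-covariance matrix as a stand-in for the paper's assumption, which is not spelled out on this page. All function names (`arun`, `apply`, `compose`, `inverse`, `random_rigid`, `same`, `check_equivariances`) are hypothetical helpers introduced for illustration.

```python
import numpy as np


def arun(X, Y):
    """Least-squares rigid fit between corresponding points.

    X, Y: (n, 3) arrays; row i of X corresponds to row i of Y.
    Returns (R, t) minimizing sum_i ||R @ X[i] + t - Y[i]||^2.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    H = (X - mx).T @ (Y - my)  # 3x3 cross-covariance of the centered points
    U, _, Vt = np.linalg.svd(H)
    # Flip the last singular direction if needed so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = my - R @ mx
    return R, t


def apply(g, P):
    """Apply the rigid transformation g = (R, t) to every row of P."""
    R, t = g
    return P @ R.T + t


def compose(g2, g1):
    """Return g2 o g1, i.e. x -> g2(g1(x))."""
    (R2, t2), (R1, t1) = g2, g1
    return R2 @ R1, R2 @ t1 + t2


def inverse(g):
    R, t = g
    return R.T, -R.T @ t


def random_rigid(rng):
    """Draw a random rotation (via QR, with det forced to +1) and translation."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q, rng.normal(size=3)


def same(g, h, tol=1e-8):
    return bool(np.allclose(g[0], h[0], atol=tol)
                and np.allclose(g[1], h[1], atol=tol))


def check_equivariances(seed=0, c=2.5):
    """Test the three properties on noisy (inexact) correspondences."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(50, 3))
    Y = apply(random_rigid(rng), X) + 0.05 * rng.normal(size=(50, 3))
    g = arun(X, Y)
    g1, g2 = random_rigid(rng), random_rigid(rng)
    # Bi-equivariance: f(g1 X, g2 Y) = g2 f(X, Y) g1^{-1}
    bi = same(arun(apply(g1, X), apply(g2, Y)),
              compose(g2, compose(g, inverse(g1))))
    # Swap-equivariance: f(Y, X) = f(X, Y)^{-1}
    swap = same(arun(Y, X), inverse(g))
    # Scale-equivariance: rotation unchanged, translation scaled by c
    scale = same(arun(c * X, c * Y), (g[0], c * g[1]))
    return {"bi": bi, "swap": swap, "scale": scale}
```

Calling `check_equivariances()` returns a dictionary of three booleans, one per property. The noise added to `Y` is deliberate: it shows that the properties hold for the least-squares solution itself, not only when an exact rigid alignment exists.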

Figures (13)

Figure 1: Two examples of PC assembly. Given a pair of PCs, the proposed method BITR transforms the source PC (red) to align with the reference PC (blue). The input PCs may be overlapping or non-overlapping.
Figure 2: An overview of BITR. The input 3-D PCs $X$ and $Y$ are first merged into a 6-D PC $Z$ by concatenating the extracted key points $\tilde{X}$ and $\tilde{Y}$. Then, $Z$ is fed into a $SE(3)\times SE(3)$-transformer to obtain equivariant features $\hat{r}$, $t_X$ and $t_Y$. These features are finally projected to $SE(3)$ as the output.
Figure 3: The results of BITR on a test example (a), and the swapped (b), scaled (c) and rigidly perturbed (d) inputs. The red, yellow and blue colors represent the source, transformed source and reference PCs respectively.
Figure 4: Assembly results on the airplane dataset. $*$ denotes methods which require the true canonical poses of the input PCs.
Figure 5: A result of BITR on assembling a motorbike and a car.
...and 8 more figures

Theorems & Definitions (33)

Definition 3.1
Proposition 3.2
Proposition 4.1
Proposition 5.1
Proposition 5.2
Proposition 5.3
Definition 5.4
Proposition 5.5
Lemma C.1
proof
...and 23 more
