Rectified Point Flow: Generic Point Cloud Pose Estimation

Tao Sun, Liyuan Zhu, Shengyu Huang, Shuran Song, Iro Armeni

TL;DR

Rectified Point Flow reframes point cloud pose estimation as a conditional generative problem over assembled shapes by learning a dense point-wise velocity field that moves points from noise toward their target configuration. The method uses a two-stage pipeline: an overlap-aware encoder captures inter-part relations, and a flow model reconstructs the assembled state, trained with a conditional flow matching (CFM) loss that enables stable training. A key contribution is the intrinsic handling of symmetry and part interchangeability through a group-theoretic invariance, plus a joint-training strategy across diverse datasets to learn shared geometric priors. Empirically, the approach achieves state-of-the-art performance on six benchmarks spanning pairwise registration and multi-part shape assembly, and demonstrates strong generalization, symmetry handling, and the ability to generate multiple plausible assemblies for the same input. The framework has practical implications for robotics and digital fabrication: it enables robust, scalable 3D alignment and assembly from unposed scans with improved symmetry-aware reasoning.

Abstract

We introduce Rectified Point Flow, a unified parameterization that formulates pairwise point cloud registration and multi-part shape assembly as a single conditional generative problem. Given unposed point clouds, our method learns a continuous point-wise velocity field that transports noisy points toward their target positions, from which part poses are recovered. In contrast to prior work that regresses part-wise poses with ad-hoc symmetry handling, our method intrinsically learns assembly symmetries without symmetry labels. Together with a self-supervised encoder focused on overlapping points, our method achieves new state-of-the-art performance on six benchmarks spanning pairwise registration and shape assembly. Notably, our unified formulation enables effective joint training on diverse datasets, facilitating the learning of shared geometric priors and consequently boosting accuracy. Project page: https://rectified-pointflow.github.io/.
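
To make the idea of a point-wise velocity field concrete, the following is a minimal NumPy sketch of the generic rectified-flow recipe, not the authors' implementation. It assumes a straight-line path between assembled points at $t=0$ and Gaussian noise at $t=1$, a squared-error flow matching loss, and Euler integration from noise back to the assembled state. All function names are hypothetical, and an oracle velocity stands in for the learned network.

```python
import numpy as np

def interpolate(x0, x1, t):
    """Straight-line path: x0 (assembled points) at t=0, x1 (noise) at t=1."""
    return (1.0 - t) * x0 + t * x1

def target_velocity(x0, x1):
    """Velocity dX/dt of the straight-line path; constant in t."""
    return x1 - x0

def flow_matching_loss(pred_velocity, x0, x1):
    """Mean squared error between predicted and target per-point velocities."""
    diff = pred_velocity - target_velocity(x0, x1)
    return float(np.mean(np.sum(diff ** 2, axis=-1)))

def euler_sample(velocity_fn, x1, num_steps=10):
    """Integrate the velocity field from t=1 (noise) down to t=0."""
    x = x1.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = 1.0 - k * dt
        x = x - dt * velocity_fn(x, t)  # step backward in time
    return x

rng = np.random.default_rng(0)
x0 = rng.normal(size=(128, 3))  # stand-in for points in the assembled state
x1 = rng.normal(size=(128, 3))  # Gaussian noise sample

# Assumption: an oracle replaces the learned, condition-dependent network.
oracle = lambda x, t: target_velocity(x0, x1)
x_hat = euler_sample(oracle, x1)
```

With the oracle velocity, the Euler steps retrace the straight line exactly, so `x_hat` matches `x0` up to floating-point error. A trained model would instead predict the velocity from the noised points and the encoded condition point clouds.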

Paper Structure

This paper contains 47 sections, 2 theorems, 19 equations, 12 figures, 12 tables.

Key Result

Theorem 1

For every element $g \in \mathcal{G}$, the flow matching learning objective $\mathcal{L}_\mathrm{CFM}$ satisfies $\mathcal{L}_\mathrm{CFM}(\bm{V}) = \mathcal{L}_\mathrm{CFM}(g (\bm{V}(t, \{\bm{X}_i(t) \}_{i\in\Omega} ; g(\bm{X})))).$

Figures (12)

  • Figure 1: Rectified Point Flow's Pose-from-Shape Pipeline. Our formulation supports shape assembly (first row) and pairwise registration (second row) tasks in a single framework. Given a set of unposed part point clouds $\{\bar{\bm{X}}_i\}_{i\in\Omega}$, Rectified Point Flow predicts each part's point cloud at the target assembled state $\{\hat{\bm{X}}_i{(0)}\}_{i\in\Omega}$. Subsequently, we solve a Procrustes problem via SVD between the condition point cloud $\bar{\bm{X}}_i$ and the estimated point cloud $\hat{\bm{X}}_i(0)$ to recover the rigid transformation $\hat{T}_i$ for each non-anchored part.
  • Figure 2: Encoder pre-training via overlap points prediction. Given unposed multi-part point clouds, our encoder with a point-wise overlap prediction head performs a binary classification to identify overlapping points. Predicted overlap points are shown in blue. For comparison, the ground-truth overlap points are visualized on the assembled object for clarity (target overlap).
  • Figure 3: Learning Rectified Point Flow. The inputs to Rectified Point Flow are the condition point clouds $\{\tilde{\bm{X}}_i\}_{i\in\Omega}$ and the noised point clouds $\{\bm{X}_i(t)\}_{i\in\Omega}$ at timestep $t$. They are first encoded by the pre-trained encoder and the positional encoding, respectively. The encoded features are concatenated and passed through the flow model, which predicts per-point velocity vectors $\{\mathrm{d} \bm{X}_i(t)/ \mathrm{d} t\}_{i\in\Omega}$ and defines the flow used to predict the part point cloud in its assembled state.
  • Figure 4: Qualitative Results on PartNet-Assembly. Columns show objects with an increasing number of parts (left to right). Rows display (1) colored input point clouds of each part, (2) GARF outputs (dashed boxes indicate samples limited to 20 parts by GARF’s design, selecting the top 20 parts by volume), (3) Rectified Point Flow outputs, and (4) ground-truth assemblies. Compared to GARF, our method produces more accurate pose estimates on most parts, especially as the number of parts increases.
  • Figure 5: Qualitative Results Across Registration and Assembly Tasks. From left to right: pairwise registration on ModelNet40 and TUD-L, shape assembly on BreakingBad-Everyday. From top to bottom: colored input point clouds, GARF results, ours, and ground truth (GT). Our single model performs best across registration and assembly tasks.
  • ...and 7 more figures
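
The pose-recovery step in the Figure 1 caption (a Procrustes problem solved via SVD) can be sketched with the standard Kabsch procedure. This is an illustrative NumPy version under stated assumptions, not the authors' code: the function name `fit_rigid_transform` is hypothetical, point correspondences are taken as given row by row, and a synthetic rotation stands in for a real prediction.

```python
import numpy as np

def fit_rigid_transform(source, target):
    """Least-squares rotation R and translation t with target ≈ source @ R.T + t."""
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Synthetic check: rotate a random "condition" cloud about the z-axis and shift it.
rng = np.random.default_rng(0)
condition_points = rng.normal(size=(200, 3))
angle = np.pi / 3
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.5, -1.0, 2.0])
predicted_points = condition_points @ R_true.T + t_true  # stand-in for the flow output

R_est, t_est = fit_rigid_transform(condition_points, predicted_points)
```

Because the synthetic target is an exact rigid transform of the source, `R_est` and `t_est` recover `R_true` and `t_true` up to floating-point error. With noisy predicted points, the same routine returns the best rigid fit in the least-squares sense.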

Theorems & Definitions (4)

  • Theorem 1: $\mathcal{G}$‑invariance of the learning objective
  • Definition 1: Assembly symmetry group
  • Lemma 1: $\mathcal{G}$‑invariance of the flow distribution
  • proof