Multiway Point Cloud Mosaicking with Diffusion and Global Optimization

Shengze Jin, Iro Armeni, Marc Pollefeys, Daniel Barath

TL;DR

The paper tackles robust multiway point cloud mosaicking for unordered, partially overlapping scans by fusing learning-based matching with classical geometry. The core is ODIN, a diffusion-enhanced, overlap-aware pairwise registration method, followed by a decoupled, diffusion-guided global optimization pipeline: global rotation averaging, optimal robust translation re-estimation, translation optimization, and diffusion-based joint pose optimization. Key contributions include the ODIN matcher with dual attention and diffusion denoising, a globally optimal translation re-estimation via maximal sphere overlaps, and a diffusion-based pose-graph optimizer conditioned on input point clouds, achieving state-of-the-art results across four large datasets. The approach substantially improves both pairwise and multiway registration accuracy, enabling reliable large-scale 3D mosaicking with practical runtimes suitable for robotics and mapping applications, as evidenced by significant reductions in rotation and translation errors on challenging benchmarks.

Abstract

We introduce a novel framework for multiway point cloud mosaicking (named Wednesday), designed to co-align sets of partially overlapping point clouds -- typically obtained from 3D scanners or moving RGB-D cameras -- into a unified coordinate system. At the core of our approach is ODIN, a learned pairwise registration algorithm that iteratively identifies overlaps and refines attention scores, employing a diffusion-based process for denoising pairwise correlation matrices to enhance matching accuracy. Further steps include constructing a pose graph from all point clouds, performing rotation averaging, re-estimating translations with a novel robust algorithm that is optimal in terms of consensus maximization, and optimizing the translations. Finally, the point cloud rotations and positions are optimized jointly by a diffusion-based approach. Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and multiway registration results by a large margin on all benchmarks. Our code and models are available at https://github.com/jinsz/Multiway-Point-Cloud-Mosaicking-with-Diffusion-and-Global-Optimization.
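To make "consensus maximization" for translations concrete, here is a minimal sketch in plain Python. It is not the paper's globally optimal algorithm. It assumes a hypothetical list of candidate translation estimates for one pose-graph edge and an assumed inlier radius, and it only searches over the candidates themselves as sphere centers. The function name `consensus_translation` and its parameters are illustrative.

```python
from math import dist


def consensus_translation(candidates, radius):
    """Return (translation, inlier_count) for the largest consensus set.

    candidates: list of 3-tuples, hypothetical translation estimates.
    radius: assumed inlier threshold (same unit as the translations).
    Heuristic: each candidate is tried as a sphere center; the center
    covering the most candidates wins, and its inliers are averaged.
    """
    if not candidates:
        raise ValueError("at least one candidate translation is required")

    best_inliers = []
    for center in candidates:
        # Candidates that fall inside the sphere around this center.
        inliers = [c for c in candidates if dist(c, center) <= radius]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers

    n = len(best_inliers)
    # Average the consensus set coordinate-wise.
    mean = tuple(sum(c[k] for c in best_inliers) / n for k in range(3))
    return mean, n


if __name__ == "__main__":
    estimates = [(1.0, 0.0, 0.0), (1.02, 0.0, 0.0), (0.98, 0.0, 0.0), (5.0, 5.0, 5.0)]
    translation, support = consensus_translation(estimates, radius=0.05)
    print(translation, support)
```

The outlier at `(5.0, 5.0, 5.0)` has no effect on the result because it never joins the largest consensus set. A plain average of all four estimates would be pulled far from the cluster.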

Paper Structure

This paper contains 17 sections, 7 equations, 6 figures, and 9 tables.

Figures (6)

  • Figure 1: The proposed multiway registration method, Wednesday, starts with pairwise registration of an unordered set of partially overlapping point clouds using the proposed matcher (ODIN). The process then optimizes the constructed pose graph, which includes global point cloud poses (vertices) and relative transforms (edges), through a sequence of steps: (a) global rotation averaging, (b) a novel optimal robust translation re-estimation method conceptualized as finding maximal sphere overlaps, (c) position averaging, and (d) diffusion-based pose graph optimization. The output is the point clouds in a unified coordinate system.
  • Figure 2: Two-view registration. Given two point clouds as input, ODIN (Section \ref{sec:pairwise}) first extracts features that are then processed by geometric self-attention to learn point-specific attention features. Next, the process is separated into two parallel streams: In (a), the features are processed by explicit one-way self- and cross-attention. This process incorporates overlap scores determined in the final stage. In (b), the features directly go through cross-attention. The determined correlation matrix is the weighted average of the correlations from the two streams. A diffusion-based denoising step cleans the correlations. Finally, point matching and transformation estimation are performed. The overlap scores implied by the estimated transform are sent back to the attention learning module as a mask and the process starts over.
  • Figure 3: Multiway point cloud registration results on two scenes from the challenging NSS dataset \cite{nss2023}, comparing the recent LIRST \cite{yew2021learning} and RMPR \cite{wang2023robust} with the proposed method (ceilings are not shown). We also visualize the provided ground truth. We show results for LIRST and RMPR as they are the best-performing alternatives in Tab. \ref{tab:NSS_mr}. Such results are typical outputs of these methods on this dataset.
  • Figure 4: Qualitative results for the 3DMatch \cite{zeng20173dmatch} dataset. See Section \ref{sec:3dmatch} for an explanation of the results. Best viewed on screen.
  • Figure 5: Qualitative results for the 3DLoMatch \cite{huang2021predator} dataset. See Section \ref{sec:3dlomatch} for an explanation of the results. Best viewed on screen.
  • ...and 1 more figure
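The Figure 2 caption describes the correlation matrix as a weighted average of two attention streams, followed by point matching. The sketch below illustrates that step only, under stated assumptions: the blending weight `weight_a` and the mutual-best-match rule are illustrative choices rather than details taken from the paper, and the diffusion-based denoising and overlap masking are omitted. Matrices are plain nested lists, with rows indexing source points and columns indexing target points.

```python
def blend_correlations(corr_a, corr_b, weight_a=0.5):
    """Weighted average of two equally sized correlation matrices.

    weight_a is a hypothetical blending weight for stream (a);
    stream (b) receives 1 - weight_a.
    """
    return [
        [weight_a * a + (1.0 - weight_a) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(corr_a, corr_b)
    ]


def mutual_best_matches(corr):
    """Return (row, col) pairs that are each other's highest-scoring match."""
    n_rows, n_cols = len(corr), len(corr[0])
    # Best column for every row, and best row for every column.
    row_best = [max(range(n_cols), key=lambda j: corr[i][j]) for i in range(n_rows)]
    col_best = [max(range(n_rows), key=lambda i: corr[i][j]) for j in range(n_cols)]
    # Keep a pair only if the choice is mutual.
    return [(i, j) for i, j in enumerate(row_best) if col_best[j] == i]


if __name__ == "__main__":
    stream_a = [[0.9, 0.1], [0.2, 0.3]]
    stream_b = [[0.7, 0.3], [0.4, 0.9]]
    blended = blend_correlations(stream_a, stream_b, weight_a=0.5)
    print(mutual_best_matches(blended))
```

The mutual check is one simple way to discard ambiguous correspondences before transformation estimation. Other matching rules, such as top-k selection or thresholding, would slot into the same place.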