P2P-Bridge: Diffusion Bridges for 3D Point Cloud Denoising

Mathias Vogel, Keisuke Tateno, Marc Pollefeys, Federico Tombari, Marie-Julie Rakotosaona, Francis Engelmann

TL;DR

The paper addresses 3D point cloud denoising under real-world scanner noise by framing denoising as a diffusion Schrödinger bridge that learns an optimal transport plan between paired clean and noisy clouds. It introduces a data-to-data diffusion process, uses shortest-path interpolation to align point sets, and reduces the stochastic process to an OT-ODE when stochasticity vanishes, enabling efficient training and inference. The approach, called P2P-Bridge, builds on a PVCNN-based architecture and can incorporate RGB or high-level DINOv2 features to boost performance, achieving state-of-the-art results on object-level datasets (PU-Net, PC-Net) and indoor scenes (ScanNet++, ARKitScenes) with as few as 5–10 inference steps. The work emphasizes the importance of data alignment, tractable diffusion bridges, and feature integration for robust denoising, and releases open-source code and pretrained models.
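The deterministic limit described above can be illustrated with a minimal sketch. It assumes that clean and noisy clouds are already paired point by point and that the path between them is a straight line traversed at constant speed; `bridge_interpolate`, `denoise_ode`, and the `oracle` predictor are hypothetical names introduced here for illustration, and the oracle merely stands in for a trained network. This is not the authors' implementation.

```python
import numpy as np


def bridge_interpolate(x_clean, x_noisy, t):
    """Point on the straight-line path between paired clouds, t in [0, 1].

    t = 0 returns the clean cloud, t = 1 the noisy cloud (assumed convention).
    """
    return (1.0 - t) * x_clean + t * x_noisy


def denoise_ode(x_noisy, predict_clean, num_steps=5):
    """Walk from t = 1 (noisy) to t = 0 (clean) in a few deterministic steps.

    predict_clean(x, t) is a hypothetical stand-in for a trained network
    that estimates the clean cloud from the current state.
    """
    times = np.linspace(1.0, 0.0, num_steps + 1)
    x = x_noisy.copy()
    for t, s in zip(times[:-1], times[1:]):
        x0_hat = predict_clean(x, t)
        # Move along the line through the current state and the estimate,
        # landing at time s < t. No noise is injected (deterministic limit).
        x = (s / t) * x + (1.0 - s / t) * x0_hat
    return x


rng = np.random.default_rng(0)
clean = rng.uniform(-1.0, 1.0, size=(128, 3))          # toy "clean" cloud
noisy = clean + 0.03 * rng.normal(size=clean.shape)    # paired noisy cloud

# Oracle predictor: always returns the true clean cloud (illustration only).
oracle = lambda x, t: clean
restored = denoise_ode(noisy, oracle, num_steps=5)
```

With a perfect predictor the walk lands on the clean cloud regardless of the number of steps, which gives some intuition for why a straight, deterministic path can tolerate a small step budget; a learned predictor would only approximate this.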

Abstract

In this work, we tackle the task of point cloud denoising through a novel framework that adapts Diffusion Schrödinger bridges to point clouds. Unlike previous approaches that predict point-wise displacements from point features or learned noise distributions, our method learns an optimal transport plan between paired point clouds. Experiments on object datasets like PU-Net and real-world datasets such as ScanNet++ and ARKitScenes show that P2P-Bridge achieves significant improvements over existing methods. While our approach demonstrates strong results using only point coordinates, we also show that incorporating additional features, such as color information or point-wise DINOv2 features, further enhances the performance. Code and pretrained models are available at https://p2p-bridge.github.io.

Paper Structure (14 sections, 10 equations, 7 figures, 5 tables)

This paper contains 14 sections, 10 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Illustration of P2P-Bridge applied to a noisy LiDAR scan.
  • Figure 2: Illustration of P2P-Bridge, modeling point cloud denoising as a reverse data-to-data diffusion process. Our model can effectively transform noisy data into cleaner data by learning a bridge between clean and noisy data.
  • Figure 3: Our network architecture is based on Point-Voxel Convolutions (PVC) [pvcnn]. We adapt the network implementation from LION [zeng2022lion], augmenting it with multi-headed global attention and a feature embedding module. Both the feature embedding and the final shared MLP block are implemented using $1 \times 1$ convolutions.
  • Figure 4: Qualitative comparison of our P2P-Bridge with recent deep-learning-based point cloud denoising methods on the PU-Net dataset under 3% isotropic Gaussian noise.
  • Figure 5: Qualitative comparison on the ScanNet++ dataset [yeshwanthliu2023scannetpp] (top 3 rows) and the ARKitScenes dataset [dehghan2021arkitscenes] (bottom 2 rows) using noisy iPhone scans as input.
  • ...and 2 more figures
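
The Figure 3 caption notes that the feature embedding and the shared MLP block are implemented using $1 \times 1$ convolutions. The sketch below illustrates why the two views coincide: a convolution with kernel width 1 along the point axis applies the same linear map to every point independently. The sizes (1024 points, 6 input channels, 32 output channels) and all variable names are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, c_in, c_out = 1024, 6, 32  # illustrative sizes (assumed)

# Per-point features in channels-first layout: (channels, points).
features = rng.normal(size=(c_in, num_points))
weight = rng.normal(size=(c_out, c_in))  # kernel of width 1
bias = rng.normal(size=(c_out,))

# 1x1 convolution along the point axis: with kernel width 1 it reduces
# to a single matrix product over the channel dimension.
conv_out = weight @ features + bias[:, None]

# Shared MLP layer: the same linear map applied to each point separately.
mlp_out = np.stack(
    [weight @ features[:, i] + bias for i in range(num_points)],
    axis=1,
)

same = np.allclose(conv_out, mlp_out)
```

Because the weights are shared across points, such a layer is independent of the number of points and of their ordering.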