Simba: Towards High-Fidelity and Geometrically-Consistent Point Cloud Completion via Transformation Diffusion

Lirui Zhang, Zhengkai Zhao, Zhi Zuo, Pan Gao, Jie Qin

TL;DR

This work targets the challenge of completing partial point clouds while preserving fine input details and global structure. It reframes point-wise transformation regression as a conditional diffusion problem over transformation fields, so that the model learns robust geometric priors rather than memorizing instance-specific transformations. The two-stage framework (Stage 1: SymmGT for transformation-field supervision; Stage 2: Sym-Diffuser with a cascaded MBA-Refiner) delivers high-fidelity, geometrically consistent completions and reports SOTA performance across PCN, ShapeNet, and KITTI, including strong synthetic-to-real transfer. The approach offers a new direction for diffusion-based shape completion, with potential for improved robustness and generalization in 3D perception tasks.

Abstract

Point cloud completion is a fundamental task in 3D vision. A persistent challenge in this field is simultaneously preserving fine-grained details present in the input while ensuring the global structural integrity of the completed shape. While recent works leveraging local symmetry transformations via direct regression have significantly improved the preservation of geometric structure details, these methods suffer from two major limitations: (1) They are prone to overfitting and tend to memorize instance-specific transformations instead of learning a generalizable geometric prior. (2) Their reliance on point-wise transformation regression leads to high sensitivity to input noise, severely degrading their robustness and generalization. To address these challenges, we introduce Simba, a novel framework that reformulates point-wise transformation regression as a distribution learning problem. Our approach integrates symmetry priors with the powerful generative capabilities of diffusion models, avoiding instance-specific memorization while capturing robust geometric structures. Additionally, we introduce a hierarchical Mamba-based architecture to achieve high-fidelity upsampling. Extensive experiments across the PCN, ShapeNet, and KITTI benchmarks validate our method's state-of-the-art (SOTA) performance.
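To make the idea of "diffusion over a transformation field" more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the per-point affine parameterization (a 3x3 matrix plus a translation), the 12-dimensional flattening, the linear beta schedule, and the helper names `apply_transformation_field` and `forward_noise` are all assumptions made for illustration. The sketch shows (a) how a field of per-point transformations maps a partial cloud to new points, and (b) the standard DDPM forward noising step applied to that field instead of to point coordinates.

```python
import numpy as np

def apply_transformation_field(points, matrices, translations):
    """Map each point p_i to A_i @ p_i + t_i.

    points:       (N, 3) partial point cloud
    matrices:     (N, 3, 3) one linear map per point (assumed parameterization)
    translations: (N, 3) one offset per point
    """
    return np.einsum("nij,nj->ni", matrices, points) + translations

def forward_noise(field, t, alphas_cumprod, rng):
    """Standard DDPM forward step: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * eps."""
    a_bar = alphas_cumprod[t]
    eps = rng.standard_normal(field.shape)
    return np.sqrt(a_bar) * field + np.sqrt(1.0 - a_bar) * eps, eps

rng = np.random.default_rng(0)
partial = rng.uniform(-1.0, 1.0, size=(4, 3))

# Toy "ground-truth" field: reflect every point across the x = 0 plane.
reflect = np.diag([-1.0, 1.0, 1.0])
A = np.repeat(reflect[None], 4, axis=0)
t_vec = np.zeros((4, 3))
mirrored = apply_transformation_field(partial, A, t_vec)

# Flatten each per-point transform into a 12-dim vector and noise it.
field = np.concatenate([A.reshape(4, 9), t_vec], axis=1)
betas = np.linspace(1e-4, 0.02, 1000)        # assumed linear schedule
alphas_cumprod = np.cumprod(1.0 - betas)
noisy_field, eps = forward_noise(field, 500, alphas_cumprod, rng)
```

In a setup like this, a denoising network conditioned on features of the partial input would be trained to predict `eps` from `noisy_field`; at inference, the sampled field would be applied to the partial points to obtain a coarse completion. How Simba conditions and parameterizes this in detail is not specified in the summary above.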

Paper Structure

This paper contains 28 sections, 13 equations, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Strong cross-domain generalizability on KITTI. Our model, trained on synthetic data, is competitive without finetuning and achieves superior performance with it.
  • Figure 2: Stage 1: SymmGT pre-training architecture. The network regresses transformation field $\boldsymbol{\mathcal{T}}_{gt}$ from partial input and complete GT point clouds.
  • Figure 3: Stage 2: The Simba coarse-to-fine architecture. Our framework comprises two core components: a Symmetry-Diffusion Module (Sym-Diffuser) that generates a field of geometric transformations to produce a structurally-complete coarse shape from partial input, and a novel MBA-Refiner decoder that progressively refines this coarse representation to yield the final high-fidelity output point cloud. FE denotes Feature Extractor.
  • Figure 4: Detailed architecture of the Mamba Fusion and MambaForward modules. A Mamba Fusion block coherently merges multi-source features and point coordinates. A core MambaForward module, integrated within a feed-forward network, then progressively refines and upsamples the geometry to produce a high-fidelity output.
  • Figure 5: Qualitative comparison on the PCN dataset. Our Simba achieves superior geometric consistency and preserves fine details where other methods struggle.
  • ...and 6 more figures