Simba: Towards High-Fidelity and Geometrically-Consistent Point Cloud Completion via Transformation Diffusion
Lirui Zhang, Zhengkai Zhao, Zhi Zuo, Pan Gao, Jie Qin
TL;DR
This work targets the challenge of completing partial point clouds while preserving both the fine details of the input and the global structure of the shape. It reframes point-wise transformation regression as a conditional diffusion problem over transformation fields, so that geometric priors are learned as a distribution rather than memorized per instance. The two-stage framework (Stage 1: SymmGT for transformation-field supervision; Stage 2: Sym-Diffuser with a cascaded MBA-Refiner) delivers high-fidelity, geometrically consistent completions and reports state-of-the-art (SOTA) performance on PCN, ShapeNet, and KITTI, including strong synthetic-to-real transfer. The approach suggests a new direction for diffusion-based shape completion, with potential for improved robustness and generalization in 3D perception tasks.
Abstract
Point cloud completion is a fundamental task in 3D vision. A persistent challenge in this field is simultaneously preserving the fine-grained details present in the input while ensuring the global structural integrity of the completed shape. While recent works leveraging local symmetry transformations via direct regression have significantly improved the preservation of geometric structure details, these methods suffer from two major limitations: (1) they are prone to overfitting, tending to memorize instance-specific transformations instead of learning a generalizable geometric prior; and (2) their reliance on point-wise transformation regression leads to high sensitivity to input noise, severely degrading their robustness and generalization. To address these challenges, we introduce Simba, a novel framework that reformulates point-wise transformation regression as a distribution learning problem. Our approach integrates symmetry priors with the powerful generative capabilities of diffusion models, avoiding instance-specific memorization while capturing robust geometric structures. Additionally, we introduce a hierarchical Mamba-based architecture to achieve high-fidelity upsampling. Extensive experiments across the PCN, ShapeNet, and KITTI benchmarks validate our method's state-of-the-art (SOTA) performance.
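To make the reformulation concrete, the following is a minimal sketch of what treating a per-point transformation field as the diffusion variable could look like. It is not the authors' implementation. The abstract does not specify the transformation parameterization, the noise schedule, or the denoiser, so the 12-parameter affine transform per point, the linear beta schedule, the DDPM-style noise-prediction loss, and all function names are assumptions chosen for illustration. Conditioning on the partial input and the denoising network itself are omitted.

```python
import math
import random

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def q_sample(field, t, alpha_bar, rng):
    """Forward-noise a clean transformation field at step t.

    field: one parameter vector per point. Here each vector holds 12 floats,
    a row-major 3x3 matrix followed by a translation (an assumption).
    Returns the noised field and the Gaussian noise that was added.
    """
    a = alpha_bar[t]
    noise = [[rng.gauss(0.0, 1.0) for _ in row] for row in field]
    noised = [
        [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(row, noise_row)]
        for row, noise_row in zip(field, noise)
    ]
    return noised, noise

def apply_field(points, field):
    """Apply each point's own affine transform: y = A x + b."""
    out = []
    for p, f in zip(points, field):
        out.append(tuple(
            sum(f[3 * i + j] * p[j] for j in range(3)) + f[9 + i]
            for i in range(3)
        ))
    return out

def noise_mse(pred, target):
    """Noise-prediction loss; a trained denoiser would supply `pred`."""
    diffs = [(p - q) ** 2 for pr, tr in zip(pred, target) for p, q in zip(pr, tr)]
    return sum(diffs) / len(diffs)

# Usage: a field that mirrors every point across the x = 0 plane.
rng = random.Random(0)
points = [(0.5, 0.2, -0.1), (1.0, 0.0, 0.3)]
mirror = [-1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
field = [mirror[:] for _ in points]
mirrored = apply_field(points, field)      # points reflected in x
alpha_bar = make_alpha_bar()
noised, noise = q_sample(field, 500, alpha_bar, rng)
```

In this framing, direct regression would predict `field` from the partial input in a single pass, whereas a diffusion model is trained to predict `noise` from `noised`, the step index, and the input, and then samples a field by iterative denoising.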
