3DMambaIPF: A State Space Model for Iterative Point Cloud Filtering via Differentiable Rendering

Qingyuan Zhou, Weidong Yang, Ben Fei, Jingyi Xu, Rui Zhang, Keyi Liu, Yeqi Luo, Ying He

TL;DR

Noise is pervasive in real-world point clouds and poses a scalability challenge for learning-based filters on dense, large-scale data. 3DMambaIPF introduces a Mamba-based state-space backbone with patch-wise processing and a differentiable point-rendering loss, guided by an adaptive ground-truth across iterations, to achieve high-fidelity denoising and edge preservation. The method attains state-of-the-art results on PU-Net and demonstrates strong scalability to around 500K points, supported by comprehensive ablations showing the contributions of the rendering loss and Mamba depth. Overall, the approach provides a practical and efficient solution for large-scale point cloud denoising with improved boundary quality and visual realism.

Abstract

Noise is an inevitable aspect of point cloud acquisition, making filtering a fundamental task in 3D vision. Existing learning-based filtering methods have shown promising capabilities on small-scale synthetic or real-world datasets. Nonetheless, their effectiveness is constrained when dealing with point clouds that contain a large number of points. This limitation primarily stems from their limited denoising capabilities for large-scale point clouds and their tendency to generate noisy outliers after denoising. The recent introduction of State Space Models (SSMs) for long sequence modeling in Natural Language Processing (NLP) presents a promising solution for handling large-scale data. Encouraged by iterative point cloud filtering methods, we introduce 3DMambaIPF, the first method to incorporate the Mamba (Selective SSM) architecture to sequentially handle extensive point clouds from large scenes, capitalizing on its strengths in selective input processing and long sequence modeling. Additionally, we integrate a robust and fast differentiable rendering loss to constrain the noisy points around the surface. In contrast to previous methods, this differentiable rendering loss enhances the visual realism of denoised geometric structures and aligns point cloud boundaries more closely with those observed in real-world objects. Extensive evaluation on datasets comprising small-scale synthetic and real-world models (typically with up to 50K points) demonstrates that our method achieves state-of-the-art results. Moreover, we showcase the superior scalability and efficiency of our method on large-scale models with about 500K points, which the majority of existing learning-based denoising methods are unable to handle.

Paper Structure

This paper contains 35 sections, 11 equations, 10 figures, 12 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Overview of 3DMambaIPF. An Encoder-Decoder-based model built on Mamba named Mamba-Denoising Module is introduced for iterative filtering. The input noisy point cloud is partitioned into patches and fed into the iterative Mamba-Denoising Module. Upon completion of the iterations, clean point cloud patches are produced as output. To enhance the filtering of noisy points around the surface, a differentiable rendering method is introduced. The rendering loss and reconstruction loss are jointly backpropagated to update the parameters of 3DMambaIPF.
  • Figure 2: Gaussian noise with a variable standard deviation is added to the GT to generate the adaptive GT. As the standard deviation gradually decreases, the adaptive GT approaches the GT with each iteration. In the final iteration, the target no longer changes.
  • Figure 3: Mamba-Denoising Module for iterative point cloud filtering of 3DMambaIPF. The Mamba-based Encoder comprises Dynamic EdgeConv modules and Mamba Blocks, while the Mamba-based Decoder consists of Mamba Decoder Blocks and an Activated Linear module.
  • Figure 4: Detailed network architectures for Dynamic EdgeConv, Mamba Block, and Mamba Decoder Block.
  • Figure 5: Visual comparison on the Stanford 3D Scanning Repository with 5% Gaussian deviation. Each point is colored by its point-wise P2F distance: points with low P2F values (clean) are shown in green, and points with high P2F values (noisy) are shown in purple.
  • ...and 5 more figures
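
The adaptive GT described in the Figure 2 caption can be illustrated with a small sketch. It assumes a linear decay of the noise standard deviation down to zero at the last iteration. The captions do not state the actual schedule, so the schedule, the function name `adaptive_gt_targets`, and the parameters `sigma_start` and `num_iters` are hypothetical choices made here for illustration only.

```python
import numpy as np

def adaptive_gt_targets(gt, sigma_start, num_iters, rng=None):
    """Build one training target per filtering iteration.

    Each target is the clean point cloud `gt` (shape [N, 3]) plus Gaussian
    noise whose standard deviation shrinks from `sigma_start` to 0.
    The linear schedule is an assumption, not taken from the paper.
    """
    if num_iters < 2:
        raise ValueError("num_iters must be at least 2 so the schedule can reach 0")
    rng = np.random.default_rng() if rng is None else rng
    # Hypothetical schedule: evenly spaced standard deviations ending at 0.
    sigmas = np.linspace(sigma_start, 0.0, num_iters)
    # Scale unit Gaussian noise by each sigma; the last target equals gt.
    targets = [gt + s * rng.standard_normal(gt.shape) for s in sigmas]
    return sigmas, targets

# Example: a toy "clean" cloud of 1,000 points and four filtering iterations.
rng = np.random.default_rng(0)
gt = rng.uniform(-1.0, 1.0, size=(1000, 3))
sigmas, targets = adaptive_gt_targets(gt, sigma_start=0.02, num_iters=4, rng=rng)
```

Under this assumed schedule the target at each iteration is slightly cleaner than the previous one, and the final target coincides with the clean point cloud, which matches the behavior the caption describes.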