SuperPC: A Single Diffusion Model for Point Cloud Completion, Upsampling, Denoising, and Colorization

Yi Du, Zhipeng Zhao, Shaoshu Su, Sharath Golluri, Haoze Zheng, Runmao Yao, Chen Wang

TL;DR

SuperPC introduces a three-level-conditioned diffusion framework with a Spatial-Mix-Fusion strategy to jointly address point cloud completion, upsampling, denoising, and colorization. By fusing image and point-cloud information across raw, local, and global conditioning, it avoids the error accumulation inherent in sequential or multi-model pipelines and achieves state-of-the-art performance on object- and scene-level benchmarks. The approach demonstrates strong generalization in object-to-scene and sim-to-real settings, and introduces new benchmarks to evaluate unified point cloud processing. Overall, SuperPC advances practical 3D scene understanding by delivering a single, efficient model that handles multiple interrelated point cloud tasks with high fidelity and color realism.

Abstract

Point cloud (PC) processing tasks, such as completion, upsampling, denoising, and colorization, are crucial in applications like autonomous driving and 3D reconstruction. Despite substantial advancements, prior approaches often address each of these tasks independently, with separate models focused on individual issues. However, this isolated approach fails to account for the fact that defects like incompleteness, low resolution, noise, and lack of color frequently coexist, with each defect influencing and correlating with the others. Simply applying these models sequentially can lead to error accumulation from each model, along with increased computational costs. To address these challenges, we introduce SuperPC, the first unified diffusion model capable of concurrently handling all four tasks. Our approach employs a three-level-conditioned diffusion framework, enhanced by a novel spatial-mix-fusion strategy, to leverage the correlations among these four defects for simultaneous, efficient processing. We show that SuperPC outperforms the state-of-the-art specialized models as well as their combination on all four individual tasks.

Paper Structure

This paper contains 61 sections, 8 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: We propose SuperPC, a novel neural architecture that jointly solves inherent shortcomings in the raw point clouds, including noise, sparsity, incompleteness, and the absence of color. To the best of our knowledge, it is the first single diffusion model that can simultaneously tackle the four major challenges in the field of point cloud processing. Red points denote high noise for visualization.
  • Figure 2: The architecture of the SuperPC model shown above integrates input images and point clouds to establish three-level conditions through innovative raw, local, and global modules. These conditions are seamlessly integrated into each step of the diffusion process, enabling SuperPC to utilize all levels of information from the two input modalities.
  • Figure 2: Generalization ability experiment on the four point cloud processing tasks. (5% data used for fine-tuning.)
  • Figure 3: On the left, the raw module integrates the raw information of the image and point cloud into the target point cloud as the raw-level condition via the image projection and the point interpolation. On the right, the local module encodes the two inputs into feature maps, which are then fused using cross-attention to produce a local fused feature map as the local-level condition. Next, the global module condenses this feature map into a global latent code as the global-level condition.
  • Figure 4: The qualitative results on the point cloud (a) completion, (b) upsampling, and (c) denoising tasks. For each subfigure, from left to right are the results for ShapeNet, TartanAir, and KITTI-360. Larger figures and more qualitative results are presented in the appendix.
  • ...and 9 more figures
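
The Figure 3 caption describes fusing point-cloud and image feature maps with cross-attention and then condensing the fused map into a global latent code. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that point features act as queries and image features as keys and values, it omits the learned projection matrices, and it uses max pooling for the global code; none of these choices are stated in the text above. The names `cross_attention` and `global_code` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    Each query row (assumed here: one point feature) attends over all
    key rows (assumed here: image features) and returns a weighted
    average of the value rows.
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused


def global_code(features):
    """Condense a feature map into one vector by channel-wise max pooling
    (an assumed pooling choice, used only for illustration)."""
    return [max(col) for col in zip(*features)]


# Toy inputs: three point features and two image features, two channels each.
point_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0]]

fused = cross_attention(point_feats, image_feats, image_feats)  # local-level sketch
code = global_code(fused)                                       # global-level sketch
```

In this toy setup each fused row is a convex combination of the image features, so a point feature that aligns with one image feature draws most of its fused value from it, while the third point, which matches both equally, receives an even mix.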