Bayesian Diffusion Models for 3D Shape Reconstruction

Haiyang Xu, Yu Lei, Zeyuan Chen, Xiang Zhang, Yue Zhao, Yilin Wang, Zhuowen Tu

TL;DR

BDM introduces a diffusion-based Bayesian framework that tightly couples bottom-up data-driven inference with top-down priors for single-view 3D shape reconstruction. It formalizes a stochastic Langevin-like update and uses denoising diffusion models to fuse prior and data information, offering two fusion variants: BDM-B (blending) and BDM-M (merging). Empirically, BDM achieves state-of-the-art results on ShapeNet-R2N2 and Pix3D across data regimes, with comprehensive ablations demonstrating robust performance gains, efficient inference, and favorable human evaluations. The approach enables effective use of standalone 3D priors to regularize reconstruction, addressing data scarcity and improving geometric fidelity in challenging real-world scenarios. This diffusion-based Bayesian fusion has broad implications for principled, interpretable integration of priors in computer vision tasks beyond 3D reconstruction.

Abstract

We present Bayesian Diffusion Models (BDM), a prediction algorithm that performs effective Bayesian inference by tightly coupling top-down (prior) information with the bottom-up (data-driven) procedure via joint diffusion processes. We show the effectiveness of BDM on the 3D shape reconstruction task. Compared to prototypical data-driven deep learning approaches trained on paired (supervised) data-label datasets (e.g. image-point cloud pairs), BDM brings in rich prior information from standalone labels (e.g. point clouds) to improve the bottom-up 3D reconstruction. As opposed to standard Bayesian frameworks, where an explicit prior and likelihood are required for inference, BDM performs seamless information fusion via coupled diffusion processes with learned gradient computation networks. The distinguishing feature of BDM is its ability to support active and effective information exchange and fusion between the top-down and bottom-up processes, each of which is itself a diffusion process. We demonstrate state-of-the-art results on both synthetic and real-world benchmarks for 3D shape reconstruction.
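
For contrast with the learned fusion described in the abstract, the sketch below illustrates the standard setting it refers to, where the prior and likelihood are both explicit and their log-gradients are summed in a Langevin update. It is not the BDM algorithm: the 1D Gaussian model, the function names, and all parameter values are assumptions chosen only so the result can be checked against the closed-form posterior mean.

```python
import numpy as np


def langevin_posterior_samples(x_obs, prior_mean=0.0, prior_var=1.0,
                               noise_var=0.5, step=0.05, n_steps=20000,
                               burn_in=1000, seed=0):
    """Unadjusted Langevin sampling of p(y | x) for a toy 1D Gaussian model.

    Assumed toy model (not from the paper):
      prior:      y ~ N(prior_mean, prior_var)
      likelihood: x | y ~ N(y, noise_var)
    """
    rng = np.random.default_rng(seed)
    y = 0.0
    samples = []
    for t in range(n_steps):
        # Top-down term: gradient of log p(y) with respect to y.
        grad_prior = -(y - prior_mean) / prior_var
        # Bottom-up term: gradient of log p(x | y) with respect to y.
        grad_lik = (x_obs - y) / noise_var
        # Langevin step: half-step along the summed gradient plus Gaussian noise.
        y = y + 0.5 * step * (grad_prior + grad_lik) \
            + np.sqrt(step) * rng.standard_normal()
        if t >= burn_in:
            samples.append(y)
    return np.array(samples)


def analytic_posterior_mean(x_obs, prior_mean=0.0, prior_var=1.0, noise_var=0.5):
    """Closed-form posterior mean for the same Gaussian prior and likelihood."""
    precision = 1.0 / prior_var + 1.0 / noise_var
    return (prior_mean / prior_var + x_obs / noise_var) / precision


if __name__ == "__main__":
    draws = langevin_posterior_samples(2.0)
    print("Langevin estimate:", draws.mean())
    print("Closed form:      ", analytic_posterior_mean(2.0))
```

In this toy case both gradients are available in closed form. Per the abstract, BDM instead relies on learned gradient computation networks inside coupled diffusion processes, so neither term has to be written down explicitly.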

Paper Structure

This paper contains 22 sections, 8 equations, 10 figures, and 11 tables.

Figures (10)

  • Figure 1: Baseline vs Bayesian Diffusion Models. Our BDM brings rich prior knowledge into the shape reconstruction process, fixing the incorrect predictions by the baseline (top row). BDM surpasses baselines in all three training data scales (bottom row).
  • Figure 2: Overview of the generative process in our Bayesian Diffusion Model. In each Bayesian denoising step, the prior diffusion model fuses with the reconstruction process, bringing rich prior knowledge and improving the quality of the reconstructed point cloud. We illustrate our Bayesian denoising step in two ways: as a flowchart (left) and as point clouds (right).
  • Figure 3: Illustration of the Bayesian Diffusion Models compared with the standard Bayesian formulation. The top part shows the standard Bayesian formulation and its stochastic gradient Langevin version; the bottom part shows our proposed BDM.
  • Figure 4: Illustration of our proposed fusion methods: BDM-M (left) and BDM-B (right).
  • Figure 5: Qualitative comparisons on the synthetic ShapeNet-R2N2 dataset. We use PC$^{2}$ [melas2023pc2] and CCD-3DR [di2023ccd] as baselines for 3D shape reconstruction. Rows 1-3 show results for PC$^{2}$, while rows 4-6 show results for CCD-3DR. Columns 2-4 show our BDM's results with 10% of the training data, and columns 5-7 show the results with 50%. Column 8 gives the corresponding ground truth.
  • ...and 5 more figures