Neural Wavelet-domain Diffusion for 3D Shape Generation

Ka-Hei Hui, Ruihui Li, Jingyu Hu, Chi-Wing Fu

TL;DR

The paper tackles the challenge of directly generating high-fidelity 3D shapes by operating on a continuous implicit surface representation in the wavelet frequency domain. It introduces a compact representation built from a truncated signed distance field (TSDF) decomposed into coarse and detail coefficient volumes, and trains a diffusion-based generator for the coarse coefficients plus a detail predictor for the detail coefficients, enabling rich shapes with complex topology. In experiments on ShapeNet, the approach outperforms state-of-the-art methods both qualitatively and quantitatively, producing cleaner surfaces, finer details, and more diverse shapes. This wavelet-domain diffusion framework opens avenues for efficient implicit-surface generation and potential extensions to conditioning, editing, and animation.

Abstract

This paper presents a new approach for 3D shape generation, enabling direct generative modeling on a continuous implicit representation in wavelet domain. Specifically, we propose a compact wavelet representation with a pair of coarse and detail coefficient volumes to implicitly represent 3D shapes via truncated signed distance functions and multi-scale biorthogonal wavelets, and formulate a pair of neural networks: a generator based on the diffusion model to produce diverse shapes in the form of coarse coefficient volumes; and a detail predictor to further produce compatible detail coefficient volumes for enriching the generated shapes with fine structures and details. Both quantitative and qualitative experimental results manifest the superiority of our approach in generating diverse and high-quality shapes with complex topology and structures, clean surfaces, and fine details, exceeding the 3D generation capabilities of the state-of-the-art models.
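
To make the coarse/detail split concrete, here is a minimal sketch of a one-level 3D wavelet decomposition of a TSDF. It rests on several assumptions that are not from the paper: a sphere stands in for a real shape, the Haar filter replaces the biorthogonal wavelets the abstract mentions, only one decomposition level is shown, and the helper names (`sphere_tsdf`, `haar_split`, `haar_merge`, `decompose`, `reconstruct`) are hypothetical.

```python
import numpy as np

def sphere_tsdf(res=32, radius=0.5, trunc=0.1):
    """Signed distance to a sphere on a res^3 grid in [-1, 1]^3, clipped to +/- trunc."""
    axis = np.linspace(-1.0, 1.0, res)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    sdf = np.sqrt(x**2 + y**2 + z**2) - radius
    return np.clip(sdf, -trunc, trunc)

def haar_split(vol, axis):
    """One Haar analysis step along `axis`: returns (low-pass, high-pass) halves."""
    v = np.moveaxis(vol, axis, 0)
    even, odd = v[0::2], v[1::2]
    low = (even + odd) / np.sqrt(2.0)
    high = (even - odd) / np.sqrt(2.0)
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_merge(low, high, axis):
    """Inverse of haar_split along `axis`."""
    lo = np.moveaxis(low, axis, 0)
    hi = np.moveaxis(high, axis, 0)
    out = np.empty((2 * lo.shape[0],) + lo.shape[1:], dtype=lo.dtype)
    out[0::2] = (lo + hi) / np.sqrt(2.0)
    out[1::2] = (lo - hi) / np.sqrt(2.0)
    return np.moveaxis(out, 0, axis)

def decompose(vol):
    """Separable one-level 3D transform: one coarse volume and seven detail sub-bands."""
    bands = [vol]
    for axis in range(3):
        bands = [half for b in bands for half in haar_split(b, axis)]
    return bands[0], bands[1:]

def reconstruct(coarse, details):
    """Inverse 3D transform: merges the eight sub-bands back into one volume."""
    bands = [coarse] + list(details)
    for axis in reversed(range(3)):
        bands = [haar_merge(bands[i], bands[i + 1], axis)
                 for i in range(0, len(bands), 2)]
    return bands[0]

tsdf = sphere_tsdf()
coarse, details = decompose(tsdf)
restored = reconstruct(coarse, details)

# Dropping the detail sub-bands gives a smoothed approximation of the TSDF.
approx = reconstruct(coarse, [np.zeros_like(d) for d in details])

print("coarse volume shape:", coarse.shape)
print("exact reconstruction:", np.allclose(restored, tsdf))
print("max error without details:", float(np.abs(approx - tsdf).max()))
```

With all sub-bands kept, the transform is invertible; discarding the detail sub-bands loses high-frequency surface information, which is the gap a detail predictor is meant to fill.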

Paper Structure (27 sections, 1 equation, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Overview of our approach. (a) Data preparation builds a compact wavelet representation (a pair of coarse and detail coefficient volumes) for each input shape using a truncated signed distance field (TSDF) and a multi-scale wavelet decomposition. (b) Shape learning trains the generator network to produce coarse coefficient volumes from random noise samples and trains the detail predictor network to produce detail coefficient volumes from coarse coefficient volumes. (c) Shape generation employs the trained generator to produce a coarse coefficient volume and then the trained detail predictor to predict a compatible detail coefficient volume, followed by an inverse wavelet transform and marching cubes, to generate the output 3D shape.
  • Figure 2: Reconstructions with different wavelet filters. (a) An input shape from ShapeNet. (b,c) Reconstructions from the $J$=3 coefficient volumes with biorthogonal wavelets. The two numbers denote the numbers of vanishing moments of the synthesis and analysis wavelets, respectively. (d) Reconstruction with the Haar wavelet.
  • Figure 3: Gallery of our generated shapes: Table, Chair, Cabinet, and Airplane (top to bottom). Our shapes exhibit complex structures, fine details, and clean surfaces, without obvious artifacts, compared with those generated by others; see Figure 4.
  • Figure 4: Visual comparisons with state-of-the-art methods. Our generated shapes exhibit finer details and cleaner surfaces, without obvious artifacts.
  • Figure 5: Shape novelty analysis. Top: For each of our generated shapes (in green), we retrieve the four most similar shapes (in blue) in the training set by Chamfer Distance (CD) and Light Field Distance (LFD). Bottom: We generate 500 chairs using our method; for each chair, we retrieve the most similar shape in the training set by LFD; then, we plot the distribution of LFDs over all retrievals, showing that our method generates both shapes that closely resemble training shapes (low LFD) and shapes that are more novel (high LFD). Note that the generated shape at the $50^{\text{th}}$ percentile is already noticeably different from its associated training-set shape.
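
The retrieval step described for Figure 5 can be sketched as a nearest-neighbor search under Chamfer Distance. This is an illustrative sketch only, under assumptions not taken from the paper: shapes are represented as small random point clouds, CD is taken as the sum of mean squared nearest-neighbor distances in both directions (conventions vary), and the function names are hypothetical. LFD is not covered, since it requires rendering the shapes.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed convention: mean squared nearest-neighbor distance, summed over
    both directions.
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def retrieve_nearest(query, training_set, k=4):
    """Indices and distances of the k training shapes closest to `query`."""
    dists = np.array([chamfer_distance(query, s) for s in training_set])
    order = np.argsort(dists)[:k]
    return order, dists[order]

# Toy "training set": six point clouds shifted along the x-axis.
rng = np.random.default_rng(0)
training = [rng.normal(size=(64, 3)) + np.array([5.0 * i, 0.0, 0.0])
            for i in range(6)]

# Toy "generated shape": a slightly perturbed copy of training shape 2.
query = training[2] + 0.01 * rng.normal(size=(64, 3))

idx, dists = retrieve_nearest(query, training, k=4)
print("nearest training shapes:", idx)
print("Chamfer distances:", dists)
```

A small distance to the top retrieval indicates a generated shape close to the training data, while a large one suggests novelty; the figure's percentile plot summarizes this quantity over many generated shapes using LFD.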