WaveletGaussian: Wavelet-domain Diffusion for Sparse-view 3D Gaussian Object Reconstruction

Hung Nguyen, Runfa Li, An Le, Truong Nguyen

TL;DR

Sparse-view 3D Gaussian Splatting suffers from poor geometry fidelity, and the diffusion-based repairs used to compensate are computationally expensive. WaveletGaussian shifts diffusion to the low-frequency LL subband of the wavelet domain and refines the high-frequency (HF) subbands with a lightweight network. Online Random Masking curates the diffusion fine-tuning data efficiently, so training time drops substantially while rendering quality is maintained. The approach achieves competitive results on Mip-NeRF 360 and OmniObject3D, with about 40% faster training and 0.3–0.5 dB PSNR gains over baselines, supporting the benefits of frequency-separated diffusion and efficient dataset creation. This work enables scalable sparse-view object reconstruction with diffusion-assisted refinement at a fraction of the computational cost of RGB-domain diffusion methods.

Abstract

3D Gaussian Splatting (3DGS) has become a powerful representation for image-based object reconstruction, yet its performance drops sharply in sparse-view settings. Prior works address this limitation by employing diffusion models to repair corrupted renders, subsequently using them as pseudo ground truths for later optimization. While effective, such approaches incur heavy computation from the diffusion fine-tuning and repair steps. We present WaveletGaussian, a framework for more efficient sparse-view 3D Gaussian object reconstruction. Our key idea is to shift diffusion into the wavelet domain: diffusion is applied only to the low-resolution LL subband, while high-frequency subbands are refined with a lightweight network. We further propose an efficient online random masking strategy to curate training pairs for diffusion fine-tuning, replacing the commonly used, but inefficient, leave-one-out strategy. Experiments across two benchmark datasets, Mip-NeRF 360 and OmniObject3D, show WaveletGaussian achieves competitive rendering quality while substantially reducing training time.
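
To make the wavelet-domain idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a 1-level Haar wavelet (the text above does not name the filter), and the names `haar_dwt2`, `haar_idwt2`, `repair_in_wavelet_domain`, `repair_ll`, and `refine_hf` are hypothetical. The two callables are placeholders for the diffusion model and the lightweight high-frequency network. The sketch shows that the LL subband has half the height and width of the input, so a diffusion model operating on it sees a quarter of the pixels.

```python
import numpy as np

def haar_dwt2(img):
    """1-level 2D Haar DWT of an (H, W) array with even H and W.

    Assumption: Haar filter; LH/HL naming conventions vary between libraries.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # coarse approximation
    lh = (a + b - c - d) / 2.0  # difference between the two rows
    hl = (a - b + c - d) / 2.0  # difference between the two columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the (2h, 2w) image from four (h, w) subbands."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

def repair_in_wavelet_domain(render, repair_ll, refine_hf):
    """Hypothetical repair flow: repair LL, refine HF subbands, then invert."""
    ll, lh, hl, hh = haar_dwt2(render)
    ll = repair_ll(ll)                  # stand-in for LL-subband diffusion
    lh, hl, hh = refine_hf(lh, hl, hh)  # stand-in for the lightweight HF network
    return haar_idwt2(ll, lh, hl, hh)

# Demo with identity placeholders on a random single-channel "render".
rng = np.random.default_rng(0)
render = rng.random((64, 64))
ll, lh, hl, hh = haar_dwt2(render)
identity_ll = lambda x: x
identity_hf = lambda lh_, hl_, hh_: (lh_, hl_, hh_)
pseudo_view = repair_in_wavelet_domain(render, identity_ll, identity_hf)
```

With identity placeholders the output equals the input, which only confirms that the transform pair is lossless. In an actual pipeline the two callables would be learned models, and any speed-up would come from running the expensive model on the smaller LL subband.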

Paper Structure

This paper contains 13 sections, 6 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: We propose WaveletGaussian, a framework for sparse-view 3D Gaussian object reconstruction based on wavelet-domain diffusion model repair, which significantly reduces training time while improving rendering quality.
  • Figure 2: The proposed WaveletGaussian framework for sparse-view 3D Gaussian object reconstruction. Central to WaveletGaussian is the repositioning of the ControlNet diffusion model from the RGB domain to the lower-resolution wavelet domain for novel-view repairs.
  • Figure 3: 1-level DWT subbands of an image region [DWTGS]. The LL subband provides a coarse approximation, while the other subbands provide directional frequency information.
  • Figure 4: Pseudo-view generation via low-resolution LL-domain diffusion. Given the corrupted LL and LH subbands in (a) and (c), our framework produces the corresponding repairs in (b) and (d). Applying the inverse DWT then yields the sample in (e), which is later used as a pseudo reference. This bypasses the full-resolution, RGB-domain diffusion shown in (f) while providing comparable results. Subbands are upsampled for better visualization.
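
For reference, the subbands named in the captions above can be written out for one common choice of filter. This is an illustrative assumption: the captions do not state which wavelet is used, so the orthonormal Haar filter is taken here, and the symbols a, b, c, d (the top-left, top-right, bottom-left, and bottom-right pixels of each non-overlapping 2×2 block) are introduced only for this sketch. Naming conventions for LH and HL differ between references.

```latex
\begin{aligned}
\mathrm{LL} &= \tfrac{1}{2}\,(a + b + c + d) && \text{(coarse approximation)} \\
\mathrm{LH} &= \tfrac{1}{2}\,(a + b - c - d) && \text{(difference between rows)} \\
\mathrm{HL} &= \tfrac{1}{2}\,(a - b + c - d) && \text{(difference between columns)} \\
\mathrm{HH} &= \tfrac{1}{2}\,(a - b - c + d) && \text{(diagonal difference)}
\end{aligned}
```

Under this assumption, an H×W image yields four subbands of size (H/2)×(W/2), so the LL subband holds one quarter of the original pixel count. The inverse DWT recovers each block exactly, for example a = (LL + LH + HL + HH)/2.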