Learnable Multi-level Discrete Wavelet Transforms for 3D Gaussian Splatting Frequency Modulation

Hung Nguyen, An Le, Truong Nguyen

TL;DR

This paper proposes a multi-level DWT-based frequency modulation framework for 3DGS and shows that the modulation can be performed with a single scaling parameter instead of learning the full 2-tap high-pass filter.

Abstract

3D Gaussian Splatting (3DGS) has emerged as a powerful approach for novel view synthesis. However, the number of Gaussian primitives often grows substantially during training as finer scene details are reconstructed, leading to increased memory and storage costs. Recent coarse-to-fine strategies regulate Gaussian growth by modulating the frequency content of the ground-truth images. In particular, AutoOpti3DGS employs the learnable Discrete Wavelet Transform (DWT) to enable data-adaptive frequency modulation. Nevertheless, its modulation depth is limited by the 1-level DWT, and jointly optimizing wavelet regularization with 3D reconstruction introduces gradient competition that promotes excessive Gaussian densification. In this paper, we propose a multi-level DWT-based frequency modulation framework for 3DGS. By recursively decomposing the low-frequency subband, we construct a deeper curriculum that provides progressively coarser supervision during early training, consistently reducing Gaussian counts. Furthermore, we show that the modulation can be performed using only a single scaling parameter, rather than learning the full 2-tap high-pass filter. Experimental results on standard benchmarks demonstrate that our method further reduces Gaussian counts while maintaining competitive rendering quality.

Paper Structure (11 sections, 10 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Illustration of the 2D DWT operations. (a) Original image. (b) 1-level DWT subbands. (c) 2-level DWT obtained by further decomposing the 1-level LL subband. (d, e) Enlarged 1- and 2-level LL subbands, respectively. The higher-level LL subbands are coarser. (f) Reconstructed image using PR-satisfying wavelets.
  • Figure 2: Overview of our framework. The multi-level DWT is employed as a differentiable image modulator. We freeze the original Haar filters and introduce a scaling parameter $\alpha$ on the high-pass analysis filters. When $\alpha=0$, all HF subbands vanish, yielding a coarse IDWT reconstruction that serves as the ground truth for early-stage 3DGS training. A PR-enforcing loss regularizes $\alpha$, progressively restoring high frequencies for automatic coarse-to-fine modulation.
  • Figure 3: Ablation results on DWT levels and scaling parameter effects (3-view LLFF dataset).
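
The modulation described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`haar_dwt2`, `haar_idwt2`, `modulate`) are hypothetical, the image is a single-channel array whose sides are divisible by $2^{\text{levels}}$, $\alpha$ is passed as a plain number rather than learned, and the PR-enforcing loss is not shown. Because $\alpha$ multiplies the 1D high-pass analysis filter, the HH subband is scaled by $\alpha^2$ in this sketch.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)  # orthonormal Haar filter coefficient


def haar_dwt2(x, alpha=1.0):
    """One-level 2D Haar analysis; alpha scales the high-pass analysis filter.

    Subband naming (LH/HL) follows one common convention and is an assumption.
    """
    # Filter along columns (horizontal direction), then downsample by 2.
    lo = (x[:, 0::2] + x[:, 1::2]) * S
    hi = (x[:, 0::2] - x[:, 1::2]) * S * alpha
    # Filter along rows (vertical direction), then downsample by 2.
    ll = (lo[0::2, :] + lo[1::2, :]) * S
    lh = (lo[0::2, :] - lo[1::2, :]) * S * alpha
    hl = (hi[0::2, :] + hi[1::2, :]) * S
    hh = (hi[0::2, :] - hi[1::2, :]) * S * alpha
    return ll, (lh, hl, hh)


def haar_idwt2(ll, highs):
    """One-level 2D Haar synthesis with the standard (frozen) Haar filters."""
    lh, hl, hh = highs
    h2, w2 = ll.shape
    lo = np.empty((2 * h2, w2))
    hi = np.empty((2 * h2, w2))
    lo[0::2], lo[1::2] = (ll + lh) * S, (ll - lh) * S
    hi[0::2], hi[1::2] = (hl + hh) * S, (hl - hh) * S
    x = np.empty((2 * h2, 2 * w2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) * S, (lo - hi) * S
    return x


def modulate(img, levels, alpha):
    """Multi-level DWT with alpha-scaled high-pass analysis, followed by IDWT."""
    ll, stack = img, []
    for _ in range(levels):
        # Recursively decompose the low-frequency (LL) subband.
        ll, highs = haar_dwt2(ll, alpha)
        stack.append(highs)
    for highs in reversed(stack):
        ll = haar_idwt2(ll, highs)
    return ll


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8))
    coarse = modulate(image, levels=2, alpha=0.0)  # all HF subbands vanish
    exact = modulate(image, levels=2, alpha=1.0)   # perfect reconstruction
    print(np.abs(exact - image).max(), np.abs(coarse - image).max())
```

With `alpha=1.0` the sketch reduces to a standard Haar DWT/IDWT pair and returns the input image. With `alpha=0.0` and `levels=J`, every $2^J \times 2^J$ block of the output holds the mean of the corresponding input block, which is why adding levels gives coarser targets.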