Light4GS: Lightweight Compact 4D Gaussian Splatting Generation via Context Model

Mufan Liu, Qi Yang, He Huang, Wenjie Huang, Zhenlong Yuan, Zhu Li, Yiling Xu

TL;DR

Light4GS addresses the storage and computation challenges of dynamic 4D Gaussian Splatting (4DGS) by introducing spatio-temporal significance pruning (STP) to drop redundant deformable primitives and an entropy-constrained, multiscale hexplane context model (MHCM) to compress latent embeddings. It also incorporates PCA-guided hexplane queries and adaptive quantization, plus entropy-constrained spherical harmonics compression, to achieve large bitrate reductions with minimal quality loss. Across synthetic and real datasets, Light4GS delivers $>\,10\times$ to $>\,200\times$ compression and up to $20\%$ FPS gains over baseline methods, while maintaining competitive PSNR/SSIM/LPIPS. These results demonstrate that storage-efficient dynamic 3DGS is feasible for real-time streaming and practical deployment, setting a strong benchmark for future compression of deformation-based Gaussian representations.

Abstract

3D Gaussian Splatting (3DGS) has emerged as an efficient and high-fidelity paradigm for novel view synthesis. To adapt 3DGS for dynamic content, deformable 3DGS incorporates temporally deformable primitives with learnable latent embeddings to capture complex motions. Despite its impressive performance, the high-dimensional embeddings and vast number of primitives lead to substantial storage requirements. In this paper, we introduce a Lightweight 4DGS framework, called Light4GS, that combines significance pruning with a deep context model to provide a lightweight, storage-efficient dynamic 3DGS representation. Light4GS builds on 4DGS, a typical representation of deformable 3DGS. Specifically, our framework has two core components: (1) a spatio-temporal significance pruning strategy that eliminates over 64% of the deformable primitives, followed by entropy-constrained spherical harmonics compression applied to the remainder; and (2) a deep context model that integrates intra- and inter-prediction with a hyperprior into a coarse-to-fine context structure to enable efficient multiscale latent embedding compression. Our approach achieves over 120× compression and increases rendering FPS by up to 20% compared to the baseline 4DGS, and it also outperforms frame-wise state-of-the-art 3DGS compression methods, demonstrating the effectiveness of both the intra- and inter-prediction in Light4GS without sacrificing rendering quality.
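To make the pruning step concrete, here is a minimal sketch of ratio-based significance pruning. It is not the paper's scoring rule, which this page does not give. It assumes a hypothetical `(T, N)` array `contrib` holding each primitive's rendering contribution at each of `T` timesteps, sums it over time to get one score per primitive, and drops the lowest-scoring fraction. The function name, the input layout, and the use of a plain sum as the spatio-temporal score are all illustrative assumptions.

```python
import numpy as np


def prune_by_significance(contrib, prune_ratio=0.64):
    """Return a boolean keep-mask over N primitives.

    contrib: (T, N) array of per-timestep contributions (hypothetical input;
             how the paper measures contribution is not specified here).
    prune_ratio: fraction of primitives to drop, lowest scores first.
    """
    contrib = np.asarray(contrib, dtype=np.float64)
    # Assumed score: total contribution of each primitive across all timesteps.
    scores = contrib.sum(axis=0)
    n = scores.shape[0]
    n_prune = int(prune_ratio * n)
    # Indices sorted from least to most significant.
    order = np.argsort(scores, kind="stable")
    keep = np.ones(n, dtype=bool)
    keep[order[:n_prune]] = False
    return keep


# Toy data: 8 timesteps, 100 primitives, skewed so most contribute very little.
rng = np.random.default_rng(0)
contrib = rng.random((8, 100)) ** 4
keep_mask = prune_by_significance(contrib, prune_ratio=0.64)
scores = contrib.sum(axis=0)
```

With `prune_ratio=0.64` and 100 primitives, the mask keeps the 36 highest-scoring ones. A real pipeline would follow this with fine-tuning and compression of the survivors' attributes, as the abstract describes.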

Paper Structure

This paper contains 22 sections, 13 equations, 17 figures, 6 tables.

Figures (17)

  • Figure 1: Motivation of 4DGS compression. Deformable primitives and hexplanes account for more than 99% of 4DGS's storage (top-right). More than 60% of primitives show almost 0 contribution to the final rendering and the hexplane exhibits strong inter-correlation (top-left). Considering these, we introduce STP to prune insignificant primitives and employ MHCM to compress hexplanes, with the three key technical components (bottom).
  • Figure 2: Structure of Light4GS. Left: Illustration of 4DGS. Middle: Storage reduction is achieved through STP and MHCM. Deformable primitives are pruned based on their spatio-temporal significance, while MHCM compresses multiscale hexplanes: the lowest scale uses a checkerboard model with hyperprior or inter-plane context, and higher scales use inter-scale context. SH coefficients of the remaining primitives are further compressed via entropy-constrained coding. Right: Entropy-constrained codec pipeline, where a deep context model or hyperprior predicts the distribution used to encode/decode items $\boldsymbol{f}$ (multiscale hexplanes or SH). SH coding estimates the distribution without any context.
  • Figure 3: Gaussian splatting is unbounded, leading to queries outside the hexplane. Misaligned hexplane query directions with principal directions further reduce the amount of information the hexplane can learn (left); Nonlinear contraction constrains all points within the hexplane bounds, while PCA aligns query directions with principal directions (right).
  • Figure 5: Qualitative comparisons on Standup in the Neu3D dataset. Light4GS achieves 65-272× compression with at most 1% PSNR degradation.
  • Figure 6: Bitrate allocation for Trex trained at two scales (low scale on the left, high scale on the right). Top row: space-only planes; Bottom row: space-time planes. Bitrate map is normalized.
  • ...and 12 more figures