4DGS-CC: A Contextual Coding Framework for 4D Gaussian Splatting Data Compression

Zicong Chen, Zhenghao Chen, Wei Jiang, Wei Wang, Lei Liu, Dong Xu

TL;DR

This work tackles the high storage cost of dynamic scene representation with 4D Gaussian Splatting (4DGS) by introducing 4DGS-CC, a contextual coding framework that splits 4DGS data into a canonical 3D Gaussian (CG) component and a 4D deformation field stored as neural voxels (NV). It then applies two dedicated contextual coding streams: Vector Quantization Contextual Coding (VQCC) for the spherical-harmonics codebook of the canonical Gaussians, and Neural Voxel Contextual Coding (NVCC) for the 4D neural voxels. VQCC relies on learned hyperpriors and NVCC on spatiotemporal priors, so that the quantized codebook vectors and voxel features are compressed losslessly within a lossy–lossless hybrid scheme. The method supports multiple rates and reduces storage by about $12\times$ on average across three benchmarks (D-NeRF, Neu3D, HyperNeRF) while maintaining rendering fidelity relative to the baseline 4DGS. Ablations indicate that compressing the CG and NV components is complementary, and that both NVCC and VQCC contribute to the savings. Overall, 4DGS-CC connects 4D Gaussian Splatting with neural data compression, enabling rate-adaptive dynamic scene representations suited to real-time or bandwidth-constrained applications.
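The lossy–lossless split and the role of context can be illustrated with a small toy example. This is a minimal sketch, not the authors' NVCC or VQCC. It assumes uniform rounding as the lossy step and an empirical count-based probability model standing in for a learned context network. The names `quantize`, `ideal_bits`, and the sine-wave "feature" are all hypothetical.

```python
import math
from collections import Counter

def quantize(values, step):
    """Lossy step: map each value to the index of its nearest multiple of `step`."""
    return [round(v / step) for v in values]

def ideal_bits(symbols, contexts=None):
    """Lossless step: ideal code length in bits, the sum of -log2 p(symbol | context),
    with p estimated from empirical counts (a stand-in for a learned model)."""
    if contexts is None:
        contexts = [None] * len(symbols)
    pair_counts = Counter(zip(contexts, symbols))
    context_counts = Counter(contexts)
    return sum(
        -math.log2(pair_counts[(c, s)] / context_counts[c])
        for c, s in zip(contexts, symbols)
    )

# Toy "feature over time": a smooth signal, so neighbouring samples are correlated.
signal = [math.sin(0.05 * t) for t in range(400)]
symbols = quantize(signal, step=0.1)

# Code each symbol either on its own or conditioned on the previously coded symbol.
current, previous = symbols[1:], symbols[:-1]
bits_without_context = ideal_bits(current)
bits_with_context = ideal_bits(current, previous)

print(f"no context:   {bits_without_context:.0f} bits")
print(f"with context: {bits_with_context:.0f} bits")
```

Conditioning on an already-decoded neighbour never increases the ideal code length under this empirical model, and for correlated data it lowers it noticeably. The sketch ignores the cost of sharing the probability model with the decoder, which a real codec would have to account for.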

Abstract

Storage is a significant challenge in reconstructing dynamic scenes with 4D Gaussian Splatting (4DGS) data. In this work, we introduce 4DGS-CC, a contextual coding framework that compresses 4DGS data to meet specific storage constraints. Building upon the established deformable 3D Gaussian Splatting (3DGS) method, our approach decomposes 4DGS data into 4D neural voxels and a canonical 3DGS component, which are then compressed using Neural Voxel Contextual Coding (NVCC) and Vector Quantization Contextual Coding (VQCC), respectively. Specifically, we first decompose the 4D neural voxels into distinct quantized features by separating the temporal and spatial dimensions. To losslessly compress each quantized feature, we leverage the previously compressed features from the temporal and spatial dimensions as priors and apply NVCC to generate the spatiotemporal context for contextual coding. Next, we employ a codebook to store spherical harmonics information from canonical 3DGS as quantized vectors, which are then losslessly compressed by using VQCC with the auxiliary learned hyperpriors for contextual coding, thereby reducing redundancy within the codebook. By integrating NVCC and VQCC, our contextual coding framework, 4DGS-CC, enables multi-rate 4DGS data compression tailored to specific storage requirements. Extensive experiments on three 4DGS data compression benchmarks demonstrate that our method achieves an average storage reduction of approximately 12 times while maintaining rendering fidelity compared to our baseline 4DGS approach.
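To make the codebook idea concrete, the following sketch shows plain vector quantization: each vector is replaced by the index of its nearest codebook entry, and only the indices plus the codebook are stored. It is an illustration under stated assumptions rather than the paper's VQCC. The codebook is fixed by hand instead of learned, no hyperprior or entropy coding is applied, and the sizes passed to `storage_bits` (100,000 vectors, 48 dimensions, 4,096 entries, 32-bit floats) are hypothetical values, not figures from the paper.

```python
import math

def nearest_index(vec, codebook):
    """Return the index of the codebook entry closest to `vec` (squared L2 distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )

def vq_encode(vectors, codebook):
    """Lossy step: keep only one codebook index per input vector."""
    return [nearest_index(v, codebook) for v in vectors]

def vq_decode(indices, codebook):
    """Reconstruct approximate vectors by looking the indices up in the codebook."""
    return [codebook[i] for i in indices]

def storage_bits(num_vectors, dim, codebook_size, float_bits=32):
    """Compare raw float storage with fixed-length indices plus the codebook itself."""
    raw = num_vectors * dim * float_bits
    index_bits = max(1, math.ceil(math.log2(codebook_size)))
    coded = num_vectors * index_bits + codebook_size * dim * float_bits
    return raw, coded

# Hypothetical 3-dimensional codebook and inputs, chosen only for illustration.
codebook = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
vectors = [[0.1, -0.1, 0.0], [0.9, 1.1, 1.0], [0.8, 0.1, 0.0]]

indices = vq_encode(vectors, codebook)
reconstructed = vq_decode(indices, codebook)

raw_bits, coded_bits = storage_bits(num_vectors=100_000, dim=48, codebook_size=4096)
print(indices, raw_bits, coded_bits)
```

The fixed-length index cost used here is only an upper bound for this toy setup. The abstract describes a further lossless stage, contextual coding with learned hyperpriors, which this sketch does not implement.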

Paper Structure

This paper contains 19 sections, 2 equations, 9 figures, 9 tables.

Figures (9)

  • Figure 1: Overview of our 4DGS data compression framework. It consists of six main components: decomposition, compression of canonical Gaussians, compression of 4D neural voxels, lossless decoding, reconstruction, and rendering.
  • Figure 2: Overview of Vector Quantization Contextual Coding.
  • Figure 3: Overview of Neural Voxel Contextual Coding.
  • Figure 4: Visualization comparison between our baseline and our methods on the "Flame-steak" scene from the Neu3D dataset. PSNR, LPIPS and the size (MB) of the scene are reported.
  • Figure 5: Ablation study of two compression components on the HyperNeRF dataset. (1) 4DGS: Our baseline 4DGS. (2) Ours (4DGS) w/o CG: Our method based on 4DGS without the compression of Canonical Gaussian. (3) Ours (4DGS) w/o NV: Our method based on 4DGS without the compression of Neural Voxel. (4) Ours (4DGS): Our method based on 4DGS.
  • ...and 4 more figures