Temporal Smoothness-Aware Rate-Distortion Optimized 4D Gaussian Splatting
Hyeongmin Lee, Kyungjune Baek
TL;DR
The paper addresses the large storage footprint of dynamic 4D Gaussian Splatting (4DGS) with an end-to-end rate-distortion (RD) optimized compression framework built on the Ex4DGS baseline. It applies a Haar wavelet transform to compress dynamic point trajectories and combines mask-based parameter pruning with entropy-constrained vector quantization under a single RD objective: $\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{dist}} + \lambda_{\text{R}} \mathcal{L}_{\text{rate}} + \lambda_{\text{reg}} \mathcal{L}_{\text{reg}}$, where $\mathcal{L}_{\text{rate}} = \lambda_{\text{GSprune}}\mathcal{L}_{\text{GSprune}} + \lambda_{\text{SHprune}}\mathcal{L}_{\text{SHprune}} + \mathcal{L}_{\text{entropy}} + \mathcal{L}_{\text{VQ}}$. The approach achieves up to 91× compression relative to the original Ex4DGS model while preserving rendering fidelity, and the weight $\lambda_{\text{R}}$ lets users trade rate against distortion to suit targets ranging from edge devices to high-performance systems. Experiments on N3V and Technicolor show substantial RD improvements. Ablation studies guide parameter choices (e.g., avoiding aggressive variance quantization), and comparisons with concurrent methods such as Light4GS show favorable speed and model size. By making dynamic Gaussian representations more compact and easier to transmit, the work moves volumetric video closer to practical deployment and real-time rendering on diverse hardware. Improving performance in the high-fidelity regime and further compressing the dynamic components are identified as future directions.
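To make the structure of the objective concrete, the following minimal sketch combines already-computed scalar loss terms exactly as in the two formulas above. The function names, argument names, and the numeric values in the usage lines are illustrative assumptions, not taken from the authors' code; how each individual term is computed is not shown here.

```python
def rate_loss(l_gs_prune, l_sh_prune, l_entropy, l_vq,
              lam_gs_prune, lam_sh_prune):
    """L_rate = lam_GSprune * L_GSprune + lam_SHprune * L_SHprune
                + L_entropy + L_VQ  (hypothetical helper)."""
    return (lam_gs_prune * l_gs_prune
            + lam_sh_prune * l_sh_prune
            + l_entropy
            + l_vq)


def total_loss(l_dist, l_rate, l_reg, lam_r, lam_reg):
    """L_total = L_dist + lam_R * L_rate + lam_reg * L_reg
    (hypothetical helper)."""
    return l_dist + lam_r * l_rate + lam_reg * l_reg


# Placeholder values chosen only to exercise the functions.
l_rate = rate_loss(1.0, 2.0, 3.0, 4.0, lam_gs_prune=0.5, lam_sh_prune=0.25)
l_total = total_loss(1.0, l_rate, 2.0, lam_r=0.1, lam_reg=0.5)
```

Raising `lam_r` weights the rate term more heavily and so favors smaller models, while lowering it favors lower distortion. This is the knob behind the rate-distortion trade-off described above.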
Abstract
Dynamic 4D Gaussian Splatting (4DGS) effectively extends the high-speed rendering capabilities of 3D Gaussian Splatting (3DGS) to represent volumetric videos. However, the large number of Gaussians, substantial temporal redundancies, and especially the absence of an entropy-aware compression framework result in large storage requirements. Consequently, this poses significant challenges for practical deployment, efficient edge-device processing, and data transmission. In this paper, we introduce a novel end-to-end RD-optimized compression framework tailored for 4DGS, aiming to enable flexible, high-fidelity rendering across varied computational platforms. Leveraging Fully Explicit Dynamic Gaussian Splatting (Ex4DGS), one of the state-of-the-art 4DGS methods, as our baseline, we start from the existing 3DGS compression methods for compatibility while effectively addressing additional challenges introduced by the temporal axis. In particular, instead of storing motion trajectories independently per point, we employ a wavelet transform to reflect the real-world smoothness prior, significantly enhancing storage efficiency. This approach yields significantly improved compression ratios and provides a user-controlled balance between compression efficiency and rendering quality. Extensive experiments demonstrate the effectiveness of our method, achieving up to 91$\times$ compression compared to the original Ex4DGS model while maintaining high visual fidelity. These results highlight the applicability of our framework for real-time dynamic scene rendering in diverse scenarios, from resource-constrained edge devices to high-performance environments. The source code is available at https://github.com/HyeongminLEE/RD4DGS.
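The intuition behind the wavelet step can be illustrated with a generic multi-level orthonormal Haar transform applied to one coordinate of a single trajectory. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the trajectory length is assumed to be a power of two, and the example trajectory is made up. For a smooth trajectory most detail coefficients are small, which is what makes the transformed representation cheaper to store.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """Multi-level orthonormal Haar transform (hypothetical helper).

    Returns [approximation, coarsest details, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        detail = [(a - b) * _S for a, b in pairs]
        approx = [(a + b) * _S for a, b in pairs]
        details = detail + details  # coarser levels go in front
    return approx + details


def haar_inverse(coeffs):
    """Inverse of haar_forward (hypothetical helper)."""
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        detail = coeffs[pos:pos + len(approx)]
        nxt = []
        for a, d in zip(approx, detail):
            nxt.append((a + d) * _S)
            nxt.append((a - d) * _S)
        pos += len(approx)
        approx = nxt
    return approx


# Made-up x-coordinate of one point over 8 frames (slow, smooth motion).
trajectory = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
coeffs = haar_forward(trajectory)
reconstructed = haar_inverse(coeffs)
```

A completely static point (constant trajectory) produces detail coefficients that are all zero, so only the single approximation coefficient carries information. In the sketch the transform is lossless; any savings come from how the coefficients are subsequently quantized and entropy coded.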
