P-4DGS: Predictive 4D Gaussian Splatting with 90$\times$ Compression
Henan Wang, Hanxin Zhu, Xinliang Gong, Tianyu He, Xin Li, Zhibo Chen
TL;DR
P-4DGS tackles the storage bottleneck in 4D Gaussian Splatting with a spatial-temporal prediction framework built on anchor-based covariant prediction and a deformation MLP that models temporal dynamics. It further applies adaptive quantization and a context-aware entropy model to jointly optimize rate and distortion, achieving up to $40\times$ compression on synthetic scenes and up to $90\times$ on real-world scenes with minimal quality loss, along with the fastest rendering speed among the compared baselines. The approach reports state-of-the-art rate-distortion (RD) performance on both synthetic (D-NeRF) and real-world (NeRF-DS) dynamic scenes, with storage of around $1$ MB on average. The work is relevant to scalable dynamic scene reconstruction and real-time rendering; it identifies the fixed size of the deformation MLP as a limitation in ultra-low-bitrate regimes and suggests improved temporal compression as future work.
Abstract
3D Gaussian Splatting (3DGS) has garnered significant attention due to its superior scene representation fidelity and real-time rendering performance, especially for dynamic 3D scene reconstruction (\textit{i.e.}, 4D reconstruction). However, despite achieving promising results, most existing algorithms overlook the substantial temporal and spatial redundancies inherent in dynamic scenes, leading to prohibitive memory consumption. To address this, we propose P-4DGS, a novel dynamic 3DGS representation for compact 4D scene modeling. Inspired by intra- and inter-frame prediction techniques commonly used in video compression, we first design a 3D anchor point-based spatial-temporal prediction module to fully exploit the spatial-temporal correlations across different 3D Gaussian primitives. Subsequently, we employ an adaptive quantization strategy combined with context-based entropy coding to further reduce the size of the 3D anchor points, thereby achieving enhanced compression efficiency. To evaluate the rate-distortion performance of our proposed P-4DGS in comparison with other dynamic 3DGS representations, we conduct extensive experiments on both synthetic and real-world datasets. Experimental results demonstrate that our approach achieves state-of-the-art reconstruction quality and the fastest rendering speed, with a remarkably low storage footprint (around \textbf{1MB} on average), achieving up to \textbf{40$\times$} and \textbf{90$\times$} compression on synthetic and real-world scenes, respectively.
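The trade-off behind "adaptive quantization combined with context-based entropy coding" can be illustrated with a minimal sketch. It is not the P-4DGS implementation: the function names (`quantize`, `empirical_bits`, `rd_cost`), the scalar step size, and the weight `lam` are hypothetical, and the context-based entropy model is replaced here by a plain empirical histogram. The sketch only shows that a coarser quantization step lowers the estimated bit cost while raising the reconstruction error, which is the balance a rate-distortion objective of the form $D + \lambda R$ controls.

```python
import math
from collections import Counter

# Illustrative only: all names and parameters below are assumptions,
# not taken from the P-4DGS paper or code.

def quantize(values, step):
    """Uniform scalar quantization: index of the nearest multiple of `step`."""
    return [round(v / step) for v in values]

def dequantize(symbols, step):
    """Map quantization indices back to reconstructed values."""
    return [s * step for s in symbols]

def empirical_bits(symbols):
    """Ideal total code length in bits under the empirical symbol histogram.

    This stands in for a learned, context-based entropy model.
    """
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())

def rd_cost(values, step, lam):
    """Return (D + lam * R, R, D) with R in bits per value and D as MSE."""
    symbols = quantize(values, step)
    recon = dequantize(symbols, step)
    distortion = sum((v - r) ** 2 for v, r in zip(values, recon)) / len(values)
    rate = empirical_bits(symbols) / len(values)
    return distortion + lam * rate, rate, distortion

if __name__ == "__main__":
    attrs = [0.1, 0.12, 0.5, 0.52, 0.9]  # toy attribute values
    for step in (0.5, 0.01):
        cost, rate, dist = rd_cost(attrs, step, lam=0.01)
        print(f"step={step}: rate={rate:.3f} bits/value, mse={dist:.6f}, cost={cost:.6f}")
```

In this toy setting the step size is a single global scalar; an adaptive scheme would instead choose it per attribute or per anchor, and a context model would condition the symbol probabilities on already-coded information rather than on a global histogram.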
