Gaussians on their Way: Wasserstein-Constrained 4D Gaussian Splatting with State-Space Modeling
Junli Deng, Yihao Luo
TL;DR
This work tackles dynamic scene rendering with 4D Gaussian Splatting, addressing temporal coherence and motion artifacts. It introduces a State Consistency Filter that fuses neural deformation-predicted observations with prior Gaussian states, and grounds Gaussian dynamics in Wasserstein geometry through Log/Exp maps on the Gaussian manifold, enabling smooth, physically plausible evolution. The approach combines a Kalman-like state update, Wasserstein distance regularization, and a neural deformation field to produce temporally coherent, high-quality renderings, validated on synthetic and real datasets with strong gains in PSNR, SSIM, and perceptual quality while maintaining real-time capabilities. Overall, the work offers a principled framework that unifies optimal transport with state-space estimation to advance dynamic 3D scene representation, with potential impact on real-time rendering, AR/VR, and robotics.
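To make the "Kalman-like state update" concrete, here is a minimal sketch of a standard Kalman fusion step applied to a Gaussian's position only. It is an illustration under stated assumptions, not the paper's filter: the function name `kalman_fuse`, the identity observation model, and the purely Euclidean treatment of the mean are all hypothetical choices made here, whereas the summary above describes the actual method as operating in Wasserstein geometry.

```python
import numpy as np

def kalman_fuse(prior_mean, prior_cov, obs_mean, obs_cov):
    """Fuse a predicted state with an observation of the same state.

    Assumes an identity observation model (the observation measures the
    state directly), which is an illustrative simplification.
    """
    # Kalman gain: how strongly to trust the observation over the prior.
    gain = prior_cov @ np.linalg.inv(prior_cov + obs_cov)
    # Move the prior toward the observation by the gain-weighted residual.
    fused_mean = prior_mean + gain @ (obs_mean - prior_mean)
    # Uncertainty shrinks after incorporating the observation.
    fused_cov = (np.eye(len(prior_mean)) - gain) @ prior_cov
    return fused_mean, fused_cov

# Hypothetical example: a prior position and a deformation-predicted
# observation with equal uncertainty.
prior_mean = np.zeros(3)
obs_mean = np.ones(3)
fused_mean, fused_cov = kalman_fuse(prior_mean, np.eye(3), obs_mean, np.eye(3))
```

With equal prior and observation uncertainty the gain is one half, so the fused position lands midway between the prior and the observation.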
Abstract
Dynamic scene rendering has advanced considerably with the rise of 4D Gaussian Splatting, but one challenge remains: how to make 3D Gaussians move through time as naturally as they would in the real world while keeping the motion smooth and consistent. In this paper, we present an approach that combines state-space modeling with Wasserstein geometry to obtain a more fluid and coherent representation of dynamic scenes. First, we introduce a State Consistency Filter that merges prior predictions with current observations, enabling Gaussians to stay true to their way over time. Second, we employ Wasserstein distance regularization to ensure smooth, consistent updates of Gaussian parameters, reducing motion artifacts. Third, we leverage Wasserstein geometry to capture both translational motion and shape deformations, yielding a more physically plausible model of dynamic scenes. Our approach guides Gaussians along their natural way in the Wasserstein space, achieving smoother, more realistic motion and stronger temporal coherence. Experimental results show significant improvements in rendering quality and efficiency over current state-of-the-art techniques.
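The Wasserstein distance between two Gaussians has a well-known closed form, which is what makes it practical as a regularizer: W2^2 = ||m_a - m_b||^2 + Tr(S_a + S_b - 2 (S_a^{1/2} S_b S_a^{1/2})^{1/2}). The first term measures translation and the second measures the change in shape. The sketch below evaluates this formula with NumPy; the helper names `psd_sqrt` and `wasserstein2_sq` are hypothetical, and how the paper weights or applies this term during training is not specified in this section.

```python
import numpy as np

def psd_sqrt(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    # Clip tiny negative eigenvalues caused by floating-point error.
    vals = np.clip(vals, 0.0, None)
    return (vecs * np.sqrt(vals)) @ vecs.T

def wasserstein2_sq(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    root_a = psd_sqrt(cov_a)
    cross = psd_sqrt(root_a @ cov_b @ root_a)
    translation = np.sum((mean_a - mean_b) ** 2)      # mean displacement
    shape = np.trace(cov_a + cov_b - 2.0 * cross)     # covariance change
    return float(translation + shape)

# Hypothetical example: the same Gaussian at two consecutive time steps,
# shifted by (3, 4, 0) with unchanged shape.
dist_sq = wasserstein2_sq(np.zeros(3), np.eye(3), np.array([3.0, 4.0, 0.0]), np.eye(3))
```

When the covariances are equal the shape term vanishes and the distance reduces to the squared Euclidean distance between the means, so a penalty on this quantity between consecutive time steps discourages both abrupt jumps and abrupt deformations.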
