A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis
Kai Katsumata, Duc Minh Vo, Hideki Nakayama
TL;DR
This work tackles real-time dynamic view synthesis by extending 3D Gaussian Splatting with a compact, time-parameterized Gaussian representation. It models Gaussian centers with a Fourier basis and rotations with a linear approximation of the quaternion, while keeping scale, color, and opacity time-invariant. This yields a memory complexity of $O(LN)$ and enables rendering at 118 FPS at a resolution of $1{,}352 \times 1{,}014$ on a single GPU. A flow-based supervision strategy aligns scene flow with the input videos, and a two-stage optimization (static priors followed by dynamic refinement), combined with divide-and-prune strategies, delivers robust reconstruction. Across the D-NeRF, DyNeRF, and HyperNeRF datasets, the approach achieves competitive visual quality with superior rendering speed, and its explicit Gaussian representation makes dynamic scenes easy to edit.
Abstract
3D Gaussian Splatting (3DGS) has shown remarkable success in synthesizing novel views given multiple views of a static scene. Yet 3DGS faces challenges when applied to dynamic scenes, because the 3D Gaussian parameters need to be updated at every timestep, which requires a large amount of memory and at least a dozen observations per timestep. To address these limitations, we present a compact dynamic 3D Gaussian representation that models positions and rotations as functions of time, each approximated with only a few parameters, while keeping the other properties of 3DGS, including scale, color, and opacity, time-invariant. Our method dramatically reduces memory usage and relaxes the strict multi-view assumption. In our experiments on monocular and multi-view scenarios, we show that our method not only matches the rendering quality of state-of-the-art methods, which often render more slowly, but also significantly surpasses them in speed, achieving $118$ frames per second (FPS) at a resolution of $1{,}352 \times 1{,}014$ on a single GPU.
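To make the time parameterization concrete, the following is a minimal NumPy sketch of how a Gaussian's center and rotation could be evaluated at a given time. It is an illustration under stated assumptions, not the authors' implementation: the function names `fourier_position` and `linear_rotation`, the coefficient layout (constant term, then sine terms, then cosine terms), and the normalization of time to $[0, 1]$ are all hypothetical choices made for this example.

```python
import numpy as np


def fourier_position(coeffs: np.ndarray, t: float) -> np.ndarray:
    """Evaluate Gaussian centers at time t from Fourier coefficients.

    coeffs has shape (N, 3, 2K + 1): for each of N Gaussians and each
    axis, one constant term, K sine terms, and K cosine terms.
    This layout is an assumption made for illustration.
    """
    n_freqs = (coeffs.shape[-1] - 1) // 2
    k = np.arange(1, n_freqs + 1)
    # Basis vector: [1, sin(2*pi*k*t) for each k, cos(2*pi*k*t) for each k]
    basis = np.concatenate(
        ([1.0], np.sin(2.0 * np.pi * k * t), np.cos(2.0 * np.pi * k * t))
    )
    return coeffs @ basis  # shape (N, 3)


def linear_rotation(q_base: np.ndarray, q_slope: np.ndarray, t: float) -> np.ndarray:
    """Evaluate rotations at time t as a linear function of t.

    q_base and q_slope have shape (N, 4). The result is renormalized
    so that each row is a unit quaternion.
    """
    q = q_base + t * q_slope
    return q / np.linalg.norm(q, axis=-1, keepdims=True)


# Usage: 1,000 Gaussians with K = 2 frequencies (hypothetical sizes).
rng = np.random.default_rng(0)
coeffs = rng.normal(size=(1000, 3, 5))
q_base = rng.normal(size=(1000, 4))
q_slope = 0.1 * rng.normal(size=(1000, 4))

centers = fourier_position(coeffs, t=0.5)          # shape (1000, 3)
rotations = linear_rotation(q_base, q_slope, 0.5)  # shape (1000, 4)
```

In this sketch, each Gaussian stores a fixed number of coefficients regardless of how many timesteps the video contains, which is the property that keeps the representation compact. Scale, color, and opacity would be stored once per Gaussian, as in static 3DGS.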
