Modeling Ambient Scene Dynamics for Free-view Synthesis
Meng-Li Shih, Jia-Bin Huang, Changil Kim, Rajvi Shah, Johannes Kopf, Chen Gao
TL;DR
This work tackles free-view synthesis of dynamic ambient scenes from monocular captures by extending 3D Gaussian Splatting to model time-varying scene components. It introduces per-Gaussian motion trajectories encoded with a DCT-based basis, predicted by an MLP, and stabilized by rigidity and depth regularization to handle unbounded scenes. The method employs a three-stage pipeline with depth-guided static reconstruction and memory-efficient, multi-pass rendering to produce high-fidelity novel views and unseen motions, and is validated on a new forest dataset of plant and ambient dynamics. The contributions include a monocular dynamic ambient dataset, motion-editing capabilities, and qualitative and quantitative improvements over baselines, enabling realistic immersive viewpoints for unbounded outdoor scenes.
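To make the DCT-based trajectory idea concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`dct_basis`, `gaussian_trajectories`), the choice of a DCT-II basis without the constant term, and the random coefficients standing in for the MLP output are all hypothetical.

```python
import numpy as np


def dct_basis(num_frames: int, num_coeffs: int) -> np.ndarray:
    """DCT-II cosine basis of shape (num_frames, num_coeffs).

    The constant (k = 0) term is skipped here (an assumption), so every
    basis function averages to zero over the sequence.
    """
    t = np.arange(num_frames)              # frame indices 0..T-1
    k = np.arange(1, num_coeffs + 1)       # frequencies 1..K
    return np.cos(np.pi / num_frames * (t[:, None] + 0.5) * k[None, :])


def gaussian_trajectories(rest_positions: np.ndarray,
                          coeffs: np.ndarray,
                          basis: np.ndarray) -> np.ndarray:
    """Per-Gaussian positions over time.

    rest_positions: (N, 3) static Gaussian centers.
    coeffs:         (N, K, 3) per-Gaussian DCT coefficients
                    (in the paper's setting these would come from an MLP).
    basis:          (T, K) DCT basis from dct_basis.
    Returns:        (T, N, 3) positions, rest position plus offset.
    """
    offsets = np.einsum("tk,nkc->tnc", basis, coeffs)
    return rest_positions[None, :, :] + offsets


# Hypothetical sizes: 5 Gaussians, 60 frames, 8 DCT coefficients.
rng = np.random.default_rng(0)
rest = rng.normal(size=(5, 3))
coeffs = 0.01 * rng.normal(size=(5, 8, 3))  # stand-in for MLP predictions
traj = gaussian_trajectories(rest, coeffs, dct_basis(60, 8))
print(traj.shape)
```

A low-frequency cosine basis of this kind keeps each trajectory smooth with only a few parameters per Gaussian, which suits gentle oscillating motion such as swaying foliage. With the constant term omitted, each Gaussian oscillates around its static position rather than drifting away from it.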
Abstract
We introduce a novel method for dynamic free-view synthesis of ambient scenes from a monocular capture, bringing an immersive quality to the viewing experience. Our method builds upon recent advancements in 3D Gaussian Splatting (3DGS), which can faithfully reconstruct complex static scenes. Previous attempts to extend 3DGS to represent dynamics have been confined to bounded scenes or require multi-camera captures, and often fail to generalize to unseen motions, limiting their practical application. Our approach overcomes these constraints by leveraging the periodicity of ambient motions to learn a motion trajectory model, coupled with careful regularization. We also propose important practical strategies to improve the visual quality of the baseline 3DGS static reconstructions and to improve memory efficiency, which is critical for GPU-memory-intensive learning. We demonstrate high-quality photorealistic novel view synthesis of several ambient natural scenes with intricate textures and fine structural elements.
