LTGS: Long-Term Gaussian Scene Chronology From Sparse View Updates
Minkwan Kim, Seungmin Lee, Junho Kim, Young Min Kim
TL;DR
LTGS models long-term scene evolution from sparse captures by updating an initial Gaussian splatting reconstruction with object-level change templates. It combines change detection from semantic and photometric cues, object-template extraction, and per-object pose-aware optimization to fuse time-varying objects with a static background. Experiments on synthetic and real-world datasets show higher reconstruction quality than NeRF and Gaussian splatting baselines, together with efficient updates, especially for abrupt object insertions, removals, and relocations. The object-centric templates act as scalable, reusable priors for digital twins, robotics, and location-based services; future work includes extending the approach to non-rigid changes and lighting variations.
Abstract
Recent advances in novel-view synthesis can create photo-realistic visualizations of real-world environments from conventional camera captures. However, capturing everyday environments casually is challenging because scenes change frequently, and modeling those changes normally requires observations that are dense both spatially and temporally. We propose long-term Gaussian scene chronology from sparse-view updates, coined LTGS, an efficient scene representation that can accommodate everyday changes from highly under-constrained casual captures. Given an incomplete and unstructured Gaussian splatting representation obtained from an initial set of input images, we robustly model the long-term chronology of the scene despite abrupt movements and subtle environmental variations. We construct objects as template Gaussians, which serve as structural, reusable priors for shared object tracks. The object templates then pass through a refinement pipeline that adapts these priors to temporally varying environments based on few-shot observations. Once trained, our framework generalizes across multiple time steps through simple transformations, which significantly improves scalability to the temporal evolution of 3D environments. Because existing datasets do not explicitly represent long-term real-world changes under a sparse capture setup, we collect real-world datasets to evaluate the practicality of our pipeline. Experiments demonstrate that our framework achieves superior reconstruction quality compared to baselines while enabling fast and lightweight updates.
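To make the idea of reusing an object template across time steps through simple transformations more concrete, the following is a minimal sketch rather than the LTGS implementation. It assumes that a template is a set of Gaussians described only by means and covariances, that each time step stores one rigid pose (R, t) per object, and that a missing pose marks the object as absent at that step. The names `transform_template`, `object_at_time`, and the dictionary layout are hypothetical and chosen for illustration.

```python
import numpy as np


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def transform_template(means, covs, R, t):
    """Apply a rigid transform to Gaussian means (N, 3) and covariances (N, 3, 3).

    A rigid transform maps a Gaussian N(mu, Sigma) to N(R mu + t, R Sigma R^T).
    """
    new_means = means @ R.T + t
    new_covs = R @ covs @ R.T  # R broadcasts over the N covariance matrices
    return new_means, new_covs


def object_at_time(template, poses, step):
    """Place a template at one time step, or return None if it is absent.

    Assumption: `poses` maps a time step to (R, t), or to None when the
    object has been removed from the scene at that step.
    """
    pose = poses.get(step)
    if pose is None:
        return None
    R, t = pose
    return transform_template(template["means"], template["covs"], R, t)


# Hypothetical template with two Gaussians.
template = {
    "means": np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    "covs": np.stack([np.diag([4.0, 1.0, 1.0]), np.eye(3)]),
}

# Hypothetical chronology: initial pose, a relocation, then a removal.
poses = {
    0: (np.eye(3), np.zeros(3)),
    1: (rot_z(np.pi / 2), np.array([2.0, 0.0, 0.0])),
    2: None,
}

means_t1, covs_t1 = object_at_time(template, poses, 1)
```

In this layout the template is stored once and each additional time step costs only one pose per object, which is one way to read the claim of fast and lightweight updates. Appearance attributes such as opacity and color, and the refinement of templates from few-shot observations, are left out of the sketch.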
