StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams
Yang LI, Jinglu Wang, Lei Chu, Xiao Li, Shiu-hong Kao, Ying-Cong Chen, Yan Lu
TL;DR
StreamGS tackles online, unposed 3D Gaussian Splatting by progressively constructing a 3DGS from image streams in a fully feed-forward manner. It combines an initial coarse two-view reconstruction with content-adaptive refinement and cross-frame feature aggregation, followed by an adaptive density control that merges matched Gaussians across frames. The approach achieves quality on par with optimization-based methods while being up to 150 times faster and exhibits strong generalizability on out-of-domain scenes. This work enables real-time, pose-free reconstruction and rendering from continuous video streams, advancing interactive 3D vision and AR/VR applications.
Abstract
The advent of 3D Gaussian Splatting (3DGS) has advanced 3D scene reconstruction and novel view synthesis. With the growing interest of interactive applications that need immediate feedback, online 3DGS reconstruction in real-time is in high demand. However, none of existing methods yet meet the demand due to three main challenges: the absence of predetermined camera parameters, the need for generalizable 3DGS optimization, and the necessity of reducing redundancy. We propose StreamGS, an online generalizable 3DGS reconstruction method for unposed image streams, which progressively transform image streams to 3D Gaussian streams by predicting and aggregating per-frame Gaussians. Our method overcomes the limitation of the initial point reconstruction \cite{dust3r} in tackling out-of-domain (OOD) issues by introducing a content adaptive refinement. The refinement enhances cross-frame consistency by establishing reliable pixel correspondences between adjacent frames. Such correspondences further aid in merging redundant Gaussians through cross-frame feature aggregation. The density of Gaussians is thereby reduced, empowering online reconstruction by significantly lowering computational and memory costs. Extensive experiments on diverse datasets have demonstrated that StreamGS achieves quality on par with optimization-based approaches but does so 150 times faster, and exhibits superior generalizability in handling OOD scenes.
