DynSUP: Dynamic Gaussian Splatting from An Unposed Image Pair
Weihang Li, Weirong Chen, Shenhan Qian, Jiajie Chen, Daniel Cremers, Haoang Li
TL;DR
DynSUP (Dynamic Gaussian Splatting from An Unposed Image Pair) enables high-fidelity novel-view synthesis of dynamic scenes from two unposed images. It combines an object-level dense bundle adjustment, which recovers the camera pose and per-object motion, with differentiable SE(3)-field-driven Gaussian rendering, in which each Gaussian undergoes its own SE(3) transformation for fine-grained motion modeling. Pose and per-object ratio alignment are applied at test time. Experiments on KITTI and Kubric show consistent improvements over static-scene and pose-dependent baselines under sparse, unposed inputs, extending Gaussian-based neural rendering to dynamic environments with minimal input.
Abstract
Recent advances in 3D Gaussian Splatting have shown promising results. Existing methods typically assume static scenes and/or multiple images with prior poses. Dynamics, sparse views, and unknown poses significantly increase the problem complexity due to insufficient geometric constraints. To overcome this challenge, we propose a method that can use only two images without prior poses to fit Gaussians in dynamic environments. To achieve this, we introduce two technical contributions. First, we propose an object-level two-view bundle adjustment. This strategy decomposes dynamic scenes into piece-wise rigid components, and jointly estimates the camera pose and motions of dynamic objects. Second, we design an SE(3) field-driven Gaussian training method. It enables fine-grained motion modeling through learnable per-Gaussian transformations. Our method leads to high-fidelity novel view synthesis of dynamic scenes while accurately preserving temporal consistency and object motion. Experiments on both synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-art approaches designed for the cases of static environments, multiple images, and/or known poses. Our project page is available at https://colin-de.github.io/DynSUP/.
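To make the idea of per-Gaussian SE(3) transformations concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each Gaussian is described by a mean and a 3x3 covariance, and that its motion is parameterized by an axis-angle rotation and a translation. The function names `rodrigues` and `transform_gaussians` and this parameterization are illustrative assumptions. Under a rigid transform (R, t), the mean maps to R·mu + t and the covariance to R·Sigma·Rᵀ.

```python
import numpy as np


def rodrigues(axis_angle):
    """Convert (N, 3) axis-angle vectors to (N, 3, 3) rotation matrices."""
    theta = np.linalg.norm(axis_angle, axis=1, keepdims=True)  # (N, 1)
    safe = np.where(theta > 1e-12, theta, 1.0)  # avoid division by zero
    k = axis_angle / safe  # unit axes (zero vector for a zero rotation)

    # Skew-symmetric cross-product matrices, one per Gaussian.
    K = np.zeros((len(k), 3, 3))
    K[:, 0, 1], K[:, 0, 2] = -k[:, 2], k[:, 1]
    K[:, 1, 0], K[:, 1, 2] = k[:, 2], -k[:, 0]
    K[:, 2, 0], K[:, 2, 1] = -k[:, 1], k[:, 0]

    s = np.sin(theta)[:, :, None]  # (N, 1, 1)
    c = np.cos(theta)[:, :, None]  # (N, 1, 1)
    return np.eye(3) + s * K + (1.0 - c) * (K @ K)


def transform_gaussians(means, covs, axis_angle, trans):
    """Apply an individual SE(3) transform to every Gaussian.

    means: (N, 3), covs: (N, 3, 3), axis_angle: (N, 3), trans: (N, 3).
    All parameter names are hypothetical and chosen for illustration only.
    """
    R = rodrigues(axis_angle)
    new_means = np.einsum("nij,nj->ni", R, means) + trans
    new_covs = R @ covs @ np.transpose(R, (0, 2, 1))
    return new_means, new_covs


# Two Gaussians: the first rotates 90 degrees about z and moves up by 1,
# the second only translates along x.
means = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
covs = np.stack([np.diag([4.0, 1.0, 1.0]), np.eye(3)])
axis_angle = np.array([[0.0, 0.0, np.pi / 2], [0.0, 0.0, 0.0]])
trans = np.array([[0.0, 0.0, 1.0], [0.5, 0.0, 0.0]])

new_means, new_covs = transform_gaussians(means, covs, axis_angle, trans)
```

In a training setting the rotation and translation parameters would be learnable and optimized through a differentiable renderer; this sketch only shows the forward transformation of the Gaussian parameters.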
