3R-GS: Best Practice in Optimizing Camera Poses Along with 3DGS
Zhisheng Huang, Peng Wang, Jingdong Zhang, Yuan Liu, Xin Li, Wenping Wang
TL;DR
The paper tackles the problem of reliably reconstructing high-quality 3D Gaussian representations when camera poses come from imperfect MASt3R-SfM outputs. It introduces 3R-GS, a joint optimization framework that combines 3DGS-MCMC for robust Gaussian refinement, an MLP-based global pose refiner to model inter-camera pose correlations, and a rendering-free epipolar distance loss to provide direct geometric supervision. The final objective blends the original 3DGS rendering loss with a geo-regularization term and a geometry-based regularizer, with the geo weight decaying over time to stabilize training. Empirical results on Tanks and Temples, Mip-NeRF360, and DTU show state-of-the-art novel view synthesis and improved camera pose registration, outperforming several baselines including 3DGS, Spann3R, ZeroGS, and CF-3DGS. The approach is computationally efficient on common GPUs and offers a path toward robust 3R-based scene reconstruction in challenging, imperfect settings, with potential extensions to dynamic or real-time scenarios.
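To make the idea of a rendering-free epipolar distance concrete, the following is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the loss as implemented in 3R-GS: it assumes matched points are given in normalized image coordinates (intrinsics already removed), that the relative pose maps camera-1 coordinates to camera-2 coordinates as X2 = R X1 + t, and that a symmetric point-to-epipolar-line distance is used. All function names are hypothetical.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def essential_matrix(R, t):
    """E = [t]_x R for the convention X2 = R X1 + t (assumed here)."""
    return matmul(skew(t), R)

def point_line_distance(x_h, line):
    """Distance from homogeneous point (x, y, 1) to the line ax + by + c = 0.
    Degenerate lines with a = b = 0 are not handled in this sketch."""
    num = abs(sum(xi * li for xi, li in zip(x_h, line)))
    return num / math.hypot(line[0], line[1])

def symmetric_epipolar_distance(E, x1, x2):
    """Mean distance of each point to the epipolar line induced by its match."""
    h1 = [x1[0], x1[1], 1.0]
    h2 = [x2[0], x2[1], 1.0]
    d2 = point_line_distance(h2, matvec(E, h1))             # line in image 2
    d1 = point_line_distance(h1, matvec(transpose(E), h2))  # line in image 1
    return 0.5 * (d1 + d2)

# Toy example: camera 2 is camera 1 shifted along x (identity rotation).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_true = (1.0, 0.0, 0.0)
X1 = (0.5, 0.2, 4.0)                              # 3D point in camera-1 frame
X2 = tuple(a + b for a, b in zip(X1, t_true))     # same point in camera-2 frame
x1 = (X1[0] / X1[2], X1[1] / X1[2])               # normalized projections
x2 = (X2[0] / X2[2], X2[1] / X2[2])

err_true = symmetric_epipolar_distance(essential_matrix(R, t_true), x1, x2)
err_noisy = symmetric_epipolar_distance(essential_matrix(R, (1.0, 0.1, 0.0)), x1, x2)
print(err_true, err_noisy)
```

With the correct pose the match satisfies the epipolar constraint and the distance vanishes, while a perturbed translation gives a positive distance. Because the quantity depends only on the camera parameters and the 2D matches, it can supervise poses without rendering any Gaussians, which is the property the summary above refers to.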
Abstract
3D Gaussian Splatting (3DGS) has revolutionized neural rendering with its efficiency and quality, but like many novel view synthesis methods, it depends heavily on accurate camera poses from Structure-from-Motion (SfM) systems. Although recent SfM pipelines have made impressive progress, questions remain about how to improve both their robustness in challenging conditions (e.g., textureless scenes) and the precision of their camera parameter estimates. We present 3R-GS, a 3D Gaussian Splatting framework that bridges this gap by jointly optimizing 3D Gaussians and camera parameters, starting from the large reconstruction prior MASt3R-SfM. We note that naively performing joint 3D Gaussian and camera optimization faces two challenges: sensitivity to the quality of the SfM initialization, and limited capacity for global optimization, both of which lead to suboptimal reconstruction results. 3R-GS overcomes these issues by incorporating optimized practices, enabling robust scene reconstruction even with imperfect camera registration. Extensive experiments demonstrate that 3R-GS delivers high-quality novel view synthesis and precise camera pose estimation while remaining computationally efficient. Project page: https://zsh523.github.io/3R-GS/
