3R-GS: Best Practice in Optimizing Camera Poses Along with 3DGS

Zhisheng Huang, Peng Wang, Jingdong Zhang, Yuan Liu, Xin Li, Wenping Wang

TL;DR

The paper tackles the problem of reliably reconstructing high-quality 3D Gaussian representations when camera poses come from imperfect MASt3R-SfM outputs. It introduces 3R-GS, a joint optimization framework that combines 3DGS-MCMC for robust Gaussian refinement, an MLP-based global pose refiner to model inter-camera pose correlations, and a rendering-free epipolar distance loss to provide direct geometric supervision. The final objective blends the original 3DGS rendering loss with a geo-regularization term and a geometry-based regularizer, with the geo weight decaying over time to stabilize training. Empirical results on Tanks and Temples, Mip-NeRF360, and DTU show state-of-the-art novel view synthesis and improved camera pose registration, outperforming several baselines including 3DGS, Spann3R, ZeroGS, and CF-3DGS. The approach is computationally efficient on common GPUs and offers a path toward robust 3R-based scene reconstruction in challenging settings with imperfect poses, with potential extensions to dynamic or real-time scenarios.
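To make the "rendering-free" geometric supervision concrete, the sketch below computes a point-to-epipolar-line distance for matched points under a given relative pose. It is a minimal illustration, not the paper's implementation: the function names, the use of normalized image coordinates (intrinsics already removed), and the one-sided point-to-line distance are all assumptions made here for brevity.

```python
import math

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def essential_matrix(R, t):
    """E = [t]_x R for a relative pose X2 = R X1 + t (assumed convention)."""
    return matmul(skew(t), R)

def epipolar_distance(E, x1, x2):
    """Distance from x2 to the epipolar line E x1 in the second image.

    x1 and x2 are (x, y) pairs in normalized image coordinates.
    Degenerate lines (zero normal) are not handled in this sketch.
    """
    p1 = [x1[0], x1[1], 1.0]
    p2 = [x2[0], x2[1], 1.0]
    line = matvec(E, p1)
    numerator = abs(sum(p2[i] * line[i] for i in range(3)))
    return numerator / math.hypot(line[0], line[1])

def mean_epipolar_loss(R, t, matches):
    """Average epipolar distance over a list of (x1, x2) matches."""
    E = essential_matrix(R, t)
    return sum(epipolar_distance(E, a, b) for a, b in matches) / len(matches)
```

Because the distance depends only on the relative pose and the 2D matches, it can be evaluated without rendering any Gaussians. For a pure sideways translation with identity rotation, a correct match yields a distance of zero, and shifting the second point vertically by 0.1 yields a distance of 0.1.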

Abstract

3D Gaussian Splatting (3DGS) has revolutionized neural rendering with its efficiency and quality, but like many novel view synthesis methods, it depends heavily on accurate camera poses from Structure-from-Motion (SfM) systems. Although recent SfM pipelines have made impressive progress, it remains an open question how to improve both their robustness in challenging conditions (e.g., textureless scenes) and the precision of their camera parameter estimates at the same time. We present 3R-GS, a 3D Gaussian Splatting framework that bridges this gap by jointly optimizing 3D Gaussians and camera parameters initialized from the large reconstruction prior MASt3R-SfM. We note that naive joint optimization of 3D Gaussians and cameras faces two challenges: sensitivity to the quality of the SfM initialization, and limited capacity for global optimization, both of which lead to suboptimal reconstruction results. 3R-GS overcomes these issues by incorporating optimized practices, enabling robust scene reconstruction even with imperfect camera registration. Extensive experiments demonstrate that 3R-GS delivers high-quality novel view synthesis and precise camera pose estimation while remaining computationally efficient. Project page: https://zsh523.github.io/3R-GS/

Paper Structure

This paper contains 16 sections, 6 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of the 3R-GS pipeline. The pipeline jointly refines camera poses and 3D Gaussian parameters.
  • Figure 2: Motivations for 3R-GS; see the sections on 3DGS-MCMC, the epipolar loss, and the MLP pose refiner.
  • Figure 3: Results for novel view synthesis. We omit the failure scenes for CF-3DGS and unreported results for ZeroGS.
  • Figure 4: Visualization of camera pose registration for the three best-performing methods. ZeroGS results are unavailable.