GSplatLoc: Ultra-Precise Camera Localization via 3D Gaussian Splatting

Atticus J. Zeller, Haijuan Wu

TL;DR

GSplatLoc tackles ultra-precise camera localization by estimating the $6$-DoF pose $(\boldsymbol{R}, \boldsymbol{t}) \in SE(3)$ of a query depth image within a pre-built 3D Gaussian Splatting scene. It casts pose estimation as gradient-based optimization over a differentiable depth renderer derived from Gaussian splats, minimizing the discrepancy between the rendered depth and observed depth using a depth plus contour loss. Key contributions include a GPU-accelerated, differentiable framework with theoretical analysis of camera pose derivatives in Gaussian Splatting, a robust depth-only objective, and extensive evaluations showing sub-millimeter translational errors on synthetic data and strong real-world performance on Replica and TUM RGB-D, surpassing ICP/GICP-based baselines in many cases. This approach enables real-time, high-precision localization suitable for dense mapping, robotics, and AR, while outlining directions to extend to full SLAM with loop closure and dynamic scenes.
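
As a rough sketch of the optimization described above, the pose estimate can be written as the minimizer of a weighted sum of a depth term and a contour term. The symbols here are assumptions introduced for illustration: $\hat{D}(\boldsymbol{R}, \boldsymbol{t})$ denotes the depth map rendered from the Gaussian scene at a candidate pose, $D$ the observed depth image, and $\lambda_d, \lambda_c$ non-negative weights. The exact loss terms, weights, and pixel masking are defined in the paper body and are not reproduced in this summary.

$$
(\boldsymbol{R}^{*}, \boldsymbol{t}^{*}) = \arg\min_{(\boldsymbol{R}, \boldsymbol{t}) \in SE(3)} \; \lambda_d \, \mathcal{L}_{\text{depth}}\big(\hat{D}(\boldsymbol{R}, \boldsymbol{t}), D\big) + \lambda_c \, \mathcal{L}_{\text{contour}}\big(\hat{D}(\boldsymbol{R}, \boldsymbol{t}), D\big)
$$

Because $\hat{D}$ is differentiable with respect to the pose, the minimizer can be sought with standard gradient-based solvers.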

Abstract

We present GSplatLoc, a camera localization method that leverages the differentiable rendering capabilities of 3D Gaussian splatting for ultra-precise pose estimation. By formulating pose estimation as a gradient-based optimization problem that minimizes discrepancies between rendered depth maps from a pre-existing 3D Gaussian scene and observed depth images, GSplatLoc achieves translational errors within 0.01 cm and near-zero rotational errors on the Replica dataset, significantly outperforming existing methods. Evaluations on the Replica and TUM RGB-D datasets demonstrate the method's robustness in challenging indoor environments with complex camera motions. GSplatLoc sets a new benchmark for localization in dense mapping, with important implications for applications requiring accurate real-time localization, such as robotics and augmented reality.
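
To make the idea of pose estimation by gradient descent on a depth residual concrete, here is a deliberately simplified toy in plain Python. It is not the GSplatLoc renderer: there is no rasterization, no alpha compositing, and no contour term. The camera rotation is fixed, so only the translation is optimized, and each "rendered depth" is just the range from the camera centre to a known scene point. The names `SCENE`, `render_depths`, `loss_and_grad`, and `localize` are hypothetical and exist only for this illustration.

```python
import math

# Hypothetical toy scene: a few 3D points standing in for Gaussian centres.
SCENE = [
    (2.0, 0.0, 3.0), (-2.0, 0.5, 3.0), (0.0, 2.0, 4.0),
    (0.0, -2.0, 2.5), (3.0, 1.0, 1.0), (-1.0, -3.0, 1.5),
]


def render_depths(t):
    """Toy 'renderer': range from the camera centre t to every scene point."""
    return [math.dist(p, t) for p in SCENE]


def loss_and_grad(t, observed):
    """Mean squared depth residual and its analytic gradient w.r.t. t."""
    n = len(SCENE)
    loss = 0.0
    grad = [0.0, 0.0, 0.0]
    for p, d_obs in zip(SCENE, observed):
        d = math.dist(p, t)
        r = d - d_obs                      # depth residual for this point
        loss += r * r / n
        for k in range(3):
            # d(dist)/dt_k = (t_k - p_k) / dist
            grad[k] += 2.0 * r * (t[k] - p[k]) / (d * n)
    return loss, grad


def localize(observed, t0, lr=0.5, steps=300):
    """Plain gradient descent on the camera translation (rotation held fixed)."""
    t = list(t0)
    for _ in range(steps):
        _, g = loss_and_grad(t, observed)
        t = [tk - lr * gk for tk, gk in zip(t, g)]
    return t


# Simulate an observation from a known pose, then recover it from a rough guess.
true_t = (0.3, -0.2, 0.1)
observed = render_depths(true_t)
estimate = localize(observed, t0=(0.0, 0.0, 0.0))
```

In this toy the gradient is written out by hand. A real differentiable renderer would supply the pose derivatives through its rasterization pipeline, and the pose would include rotation as well as translation.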

Paper Structure (16 sections, 14 equations, 1 figure, 4 tables)

Figures (1)

  • Figure 1: We propose GSplatLoc, a novel camera localization method that leverages the differentiable rendering capabilities of 3D Gaussian splatting for efficient and accurate pose estimation.