
VBGS-SLAM: Variational Bayesian Gaussian Splatting Simultaneous Localization and Mapping

Yuhan Zhu, Yanyu Zhang, Jie Xu, Wei Ren

Abstract

3D Gaussian Splatting (3DGS) has shown promising results for 3D scene modeling using mixtures of Gaussians, yet its existing simultaneous localization and mapping (SLAM) variants typically rely on direct, deterministic pose optimization against the splat map, making them sensitive to initialization and susceptible to catastrophic forgetting as the map evolves. We propose Variational Bayesian Gaussian Splatting SLAM (VBGS-SLAM), a novel framework that couples splat map refinement and camera pose tracking in a generative probabilistic form. By leveraging the conjugate properties of multivariate Gaussians and variational inference, our method admits efficient closed-form updates and explicitly maintains posterior uncertainty over both poses and scene parameters. This uncertainty-aware approach mitigates drift and enhances robustness in challenging conditions, while preserving the efficiency and rendering quality of existing 3DGS pipelines. Our experiments demonstrate superior tracking performance and robustness on long sequences, alongside efficient, high-quality novel view synthesis across diverse synthetic and real-world scenes.
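To make the "closed-form updates" concrete: when a Gaussian prior is combined with a Gaussian likelihood, conjugacy gives a Gaussian posterior whose parameters can be written down directly, with no iterative optimization. The sketch below is a minimal toy illustration of that conjugacy under an assumed known-covariance observation model; the function name `conjugate_gaussian_update` and the example values are hypothetical and are not the paper's actual objective or API.

```python
import numpy as np

def conjugate_gaussian_update(mu_prior, Sigma_prior, obs, Sigma_obs):
    """Closed-form posterior for a Gaussian mean under a Gaussian prior.

    Illustrative sketch (not VBGS-SLAM's objective): with a prior
    N(mu_prior, Sigma_prior) on a latent mean and one observation
    obs ~ N(mean, Sigma_obs), conjugacy yields the posterior in
    closed form, and the posterior covariance tracks the remaining
    uncertainty after the update.
    """
    prior_prec = np.linalg.inv(Sigma_prior)
    obs_prec = np.linalg.inv(Sigma_obs)
    # Posterior precision is the sum of prior and observation precisions.
    Sigma_post = np.linalg.inv(prior_prec + obs_prec)
    # Posterior mean is the precision-weighted combination of the two.
    mu_post = Sigma_post @ (prior_prec @ mu_prior + obs_prec @ obs)
    return mu_post, Sigma_post

# Toy example: refine a 3D Gaussian center with one noisy observation.
mu, Sigma = np.zeros(3), np.eye(3)
obs = np.array([0.1, -0.05, 2.0])
mu, Sigma = conjugate_gaussian_update(mu, Sigma, obs, 0.01 * np.eye(3))
```

Because each update is a matrix solve rather than a gradient loop, a map refined this way keeps per-parameter uncertainty (the posterior covariance) essentially for free, which is the mechanism the abstract appeals to.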

Paper Structure

This paper contains 24 sections, 14 equations, 2 figures, and 8 tables.

Figures (2)

  • Figure C1: VBGS-SLAM System Overview: Our method takes RGB-D images as input and initializes a point cloud via back-projection (see the sketch after this figure list). This point cloud seeds a probabilistic 3D Gaussian map parameterized by spatial and color distributions under a generative prior model. The pipeline can be interpreted as a closed-form variational inference framework that jointly optimizes the Gaussian map and estimates the camera pose through a unified objective in Eq. \ref{eq:vbgs_slam_kld_fixed}. The online processing integrates new RGB-D image pairs, where keyframe management governs Gaussian adaptation, keyframe selection, and co-visibility checking. This unified strategy enables efficient real-time SLAM by coupling mapping and tracking within a shared closed-form update loop.
  • Figure E1: Qualitative comparison of rendered images against ground truth. The top row shows results on the AR-TABLE dataset, while the second row shows results on the TUM-RGBD dataset. Columns correspond to the ground-truth image, MonoGS, SplaTAM, and VBGS-SLAM (ours), respectively.
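The Figure C1 caption describes initializing the point cloud by back-projecting the RGB-D input. The following is a minimal sketch of that standard step under a pinhole camera model; the function name `backproject_depth` and the intrinsics (fx, fy, cx, cy) are illustrative assumptions, not the paper's code.

```python
import numpy as np

def backproject_depth(depth, rgb, fx, fy, cx, cy):
    """Back-project an RGB-D frame into a colored point cloud.

    depth: (H, W) metric depth map; rgb: (H, W, 3) uint8 image.
    Standard pinhole model: X = (u - cx) * z / fx, Y = (v - cy) * z / fy.
    Returns (N, 3) 3D points and (N, 3) colors for pixels with valid depth.
    """
    H, W = depth.shape
    # u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth
    valid = z > 0  # drop missing depth readings
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x[valid], y[valid], z[valid]], axis=-1)
    colors = rgb[valid].astype(np.float32) / 255.0
    return points, colors
```

Each back-projected point then seeds one Gaussian in the map, carrying both a spatial location and a color sample, consistent with the spatial-and-color parameterization the caption describes.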