DGS-SLAM: Gaussian Splatting SLAM in Dynamic Environment

Mangyu Kong, Jaewon Lee, Seongwon Lee, Euntai Kim

TL;DR

This work introduces Dynamic Gaussian Splatting SLAM (DGS-SLAM), the first dynamic SLAM framework built on Gaussian Splatting. It also proposes a robust mask generation method that enforces photometric consistency across keyframes, reducing noise from inaccurate segmentation and from artifacts such as shadows.

Abstract

We introduce Dynamic Gaussian Splatting SLAM (DGS-SLAM), the first dynamic SLAM framework built on the foundation of Gaussian Splatting. While recent advancements in dense SLAM have leveraged Gaussian Splatting to enhance scene representation, most approaches assume a static environment, making them vulnerable to photometric and geometric inconsistencies caused by dynamic objects. To address these challenges, we integrate Gaussian Splatting SLAM with a robust filtering process to handle dynamic objects throughout the entire pipeline, including Gaussian insertion and keyframe selection. Within this framework, to further improve the accuracy of dynamic object removal, we introduce a robust mask generation method that enforces photometric consistency across keyframes, reducing noise from inaccurate segmentation and artifacts such as shadows. Additionally, we propose the loop-aware window selection mechanism, which utilizes unique keyframe IDs of 3D Gaussians to detect loops between the current and past frames, facilitating joint optimization of the current camera poses and the Gaussian map. DGS-SLAM achieves state-of-the-art performance in both camera tracking and novel view synthesis on various dynamic SLAM benchmarks, proving its effectiveness in handling real-world dynamic scenes.
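
To make the loop-aware window selection idea more concrete, here is a minimal sketch, assuming each 3D Gaussian stores the ID of the keyframe that inserted it. The function name `loop_candidate_keyframes` and the parameters `min_gap` and `min_shared` are hypothetical and not taken from the paper; the paper's actual selection criterion may differ.

```python
from collections import Counter


def loop_candidate_keyframes(visible_gaussian_kf_ids, current_kf_id,
                             min_gap=10, min_shared=3):
    """Return past keyframe IDs that may form a loop with the current frame.

    visible_gaussian_kf_ids: iterable of ints, one per Gaussian visible in
        the current view, giving the keyframe ID stored on that Gaussian.
    min_gap: hypothetical minimum ID distance for a keyframe to count as
        "past" rather than a recent neighbour.
    min_shared: hypothetical minimum number of visible Gaussians that must
        originate from a keyframe for it to be considered.
    """
    counts = Counter(visible_gaussian_kf_ids)
    candidates = [
        (kf_id, n) for kf_id, n in counts.items()
        if current_kf_id - kf_id >= min_gap and n >= min_shared
    ]
    # Keyframes sharing the most Gaussians come first; ties go to older IDs.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [kf_id for kf_id, _ in candidates]


# Gaussians visible from keyframe 20 were inserted by keyframes 0, 1, 2, 18, 19.
visible = [0, 0, 0, 0, 1, 1, 1, 18, 18, 19, 19, 19, 19, 2]
loop_keyframes = loop_candidate_keyframes(visible, current_kf_id=20)
```

In this toy input, keyframes 18 and 19 are too recent to count as loops and keyframe 2 contributes too few Gaussians, so only the older, well-represented keyframes are returned as candidates for the optimization window.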

Paper Structure

This paper contains 15 sections, 12 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Result of our framework. Top: RGB-D frames as input. Center: Reconstructed Gaussian map model without dynamics. Bottom: Rendered images from tracked camera pose.
  • Figure 2: Framework overview. Our framework simultaneously estimates the camera pose and reconstructs a 3D Gaussian Splatting map from a sequence of RGB-D frames in a dynamic environment. DGS-SLAM consists of three main components: initialization, frontend tracking, and backend mapping. During initialization, the Gaussians are optimized based on the first frame. In the frontend, DGS-SLAM estimates the camera pose while filtering out dynamic elements. The backend then performs joint optimization to refine the pose and update the 3D Gaussian Splatting map.
  • Figure 3: Comparison of rendered results from state-of-the-art Gaussian Splatting SLAM approaches based on the estimated input frame poses.
  • Figure 4: Visualization of robust mask generation. From right to left: the input image, rendered image, robust mask, and full mask. In the full mask, blue represents the semantic segmentation mask, and red indicates the robust mask.
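
The relationship between the masks in Figure 4 can be illustrated with a small sketch. It assumes, as a simplification, that the robust mask flags pixels whose photometric residual between the input and rendered image exceeds a threshold in a single frame, and that the full mask is the union of the semantic and robust masks. The function names and the `threshold` value are hypothetical; the paper's method enforces consistency across keyframes and may be formulated differently. Plain nested lists are used to keep the example dependency-free; a real implementation would be vectorized.

```python
def robust_mask(input_gray, rendered_gray, threshold=0.2):
    """Flag pixels whose absolute photometric residual exceeds the threshold."""
    return [
        [abs(a - b) > threshold for a, b in zip(row_in, row_ren)]
        for row_in, row_ren in zip(input_gray, rendered_gray)
    ]


def full_mask(semantic, robust):
    """Union of the two masks (True = pixel treated as dynamic)."""
    return [
        [s or r for s, r in zip(row_s, row_r)]
        for row_s, row_r in zip(semantic, robust)
    ]


# Toy 2x3 grayscale images with intensities in [0, 1].
input_gray = [[0.5, 0.5, 0.9], [0.5, 0.1, 0.5]]
rendered_gray = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
# Hypothetical semantic segmentation output marking one pixel as dynamic.
semantic = [[False, True, False], [False, False, False]]

robust = robust_mask(input_gray, rendered_gray)
combined = full_mask(semantic, robust)
```

Here the residual test catches two pixels that the semantic mask misses (for example, a shadow), while the semantic mask contributes one pixel whose residual happens to be small.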