Gassidy: Gaussian Splatting SLAM in Dynamic Environments

Long Wen, Shixin Li, Yu Zhang, Yuhong Huang, Jianjie Lin, Fengjunjie Pan, Zhenshan Bing, Alois Knoll

TL;DR

Gassidy (Gaussian Splatting SLAM in Dynamic Environments) is an RGB-D dense SLAM system that calculates Gaussians to generate rendering loss flows for each environmental component, based on a designed photometric-geometric loss function.

Abstract

3D Gaussian Splatting (3DGS) allows flexible adjustments to scene representation, enabling continuous optimization of scene quality during dense visual simultaneous localization and mapping (SLAM) in static environments. However, 3DGS faces challenges in handling environmental disturbances from dynamic objects with irregular movement, leading to degradation in both camera tracking accuracy and map reconstruction quality. To address this challenge, we develop Gassidy (Gaussian Splatting SLAM in Dynamic Environments), an RGB-D dense SLAM system. This approach calculates Gaussians to generate rendering loss flows for each environmental component based on a designed photometric-geometric loss function. To distinguish and filter environmental disturbances, we iteratively analyze rendering loss flows to detect features characterized by changes in loss values between dynamic objects and static components. This process ensures a clean environment for accurate scene reconstruction. Compared to state-of-the-art SLAM methods, experimental results on open datasets show that Gassidy improves camera tracking precision by up to 97.9% and enhances map quality by up to 6%.
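
The abstract describes a combined photometric-geometric loss whose per-component values are tracked over iterations. The paper's actual loss is not reproduced in this summary, so the following is only a minimal sketch under stated assumptions: both terms are taken to be mean absolute (L1) differences, they are mixed with a hypothetical weight `lam`, and a "loss flow" is approximated as the change in loss between consecutive iterations. All function names are illustrative.

```python
from typing import List, Sequence


def l1_mean(rendered: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute difference between two equally sized pixel lists."""
    if len(rendered) != len(observed) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and equally sized")
    return sum(abs(r - o) for r, o in zip(rendered, observed)) / len(rendered)


def photo_geo_loss(
    rgb_rendered: Sequence[float],
    rgb_observed: Sequence[float],
    depth_rendered: Sequence[float],
    depth_observed: Sequence[float],
    lam: float = 0.9,  # hypothetical weight, not taken from the paper
) -> float:
    """Weighted sum of a photometric (color) and a geometric (depth) L1 term."""
    l_pho = l1_mean(rgb_rendered, rgb_observed)
    l_geo = l1_mean(depth_rendered, depth_observed)
    return lam * l_pho + (1.0 - lam) * l_geo


def loss_flow(losses: List[float]) -> List[float]:
    """Change in loss between consecutive optimization iterations."""
    return [later - earlier for earlier, later in zip(losses, losses[1:])]


# Illustrative values only: a static component's loss keeps falling,
# while a moving object's loss stalls or rises between iterations.
background_flow = loss_flow([4.0, 2.0, 1.0])
object_flow = loss_flow([4.0, 4.5, 5.0])
```

In this toy setting, the background flow is consistently negative while the object flow is not, which is the kind of difference in loss behavior the abstract says is used to separate dynamic objects from static components.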

Paper Structure

This paper contains 11 sections, 8 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 2: An example illustrating the performance of Gassidy compared with GS-SLAM (here GSS; Matsuki, Murai, et al., CVPR 2024) on the TUM RGB-D dataset in the fr3/walk_st scene. The three images in each row show the rendered depth, the created Gaussians, and the rendered RGB.
  • Figure 3: Architecture of Gassidy: $i$ represents the frame number, $C_{i}$ and $D_{i}$ are the RGB image and the corresponding aligned depth map, $O_{i}$ and $B_{i}$ are the RGB-D sets of objects and background, $G_i^{O}$ and $G_i^{B}$ are the Gaussians of objects and background, and $L_{pho}$ and $L_{geo}$ are the photometric and geometric rendering losses. $R_i$ and $t_i$ are the rotation and translation parts of the camera pose. IoU and OC stand for Intersection over Union and Overlap Coefficient. $G^{O}$ and $G^{O_e}$ denote Gaussians for all objects and static objects, while $G^{B}$ represents background Gaussians. $P$ indicates the probability of an object being dynamic, and $\theta$ is the threshold.
  • Figure 4: Camera tracking trajectories of Gassidy and GS-SLAM in dynamic scenes from the TUM dataset. Each method's trajectory, along with the ground truth, is shown in a different color, and the moment of significant object movement is indicated by an arrow.
  • Figure 5: Large-scene reconstruction quality comparison between our method and other 3DGS-based methods in the person_track scene from the BONN dataset. The red box highlights the flaws in these methods.
  • Figure 6: Map reconstruction quality comparison between our method and other 3DGS-based methods on the fr3/walk_xyz scene from the TUM dataset. The image on the left shows the rendered RGB, while the image on the right is the generated Gaussian map.
  • ...and 1 more figures
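
The Figure 3 caption names Intersection over Union (IoU) and the Overlap Coefficient (OC) without defining them. A minimal sketch of the standard definitions follows, assuming object masks are represented as sets of (row, column) pixel coordinates. That representation and the function names are hypothetical, and how Gassidy applies these scores is not stated in this summary.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, column)


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection over Union: |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def overlap_coefficient(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Overlap Coefficient: |A ∩ B| / min(|A|, |B|) (0.0 if either is empty)."""
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Two toy masks: a 2x2 block and a 1x2 strip sharing one pixel.
mask_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
mask_b = {(1, 1), (1, 2)}

score_iou = iou(mask_a, mask_b)                 # 1 shared / 5 total pixels
score_oc = overlap_coefficient(mask_a, mask_b)  # 1 shared / 2 pixels in the smaller mask
```

The two scores behave differently when mask sizes differ: OC stays high whenever the smaller mask lies mostly inside the larger one, whereas IoU is penalized by the size mismatch.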