Go-SLAM: Grounded Object Segmentation and Localization with Gaussian Splatting SLAM

Phu Pham, Dipam Patel, Damon Conover, Aniket Bera

TL;DR

The results show that Go-SLAM effectively bridges the gap between geometric mapping and semantic understanding, supporting real-time scene interaction and object retrieval in open-world environments.

Abstract

We introduce Go-SLAM, a novel framework that utilizes 3D Gaussian Splatting SLAM to reconstruct dynamic environments while embedding object-level information within the scene representations. This framework employs advanced object segmentation techniques, assigning a unique identifier to each Gaussian splat that corresponds to the object it represents. Consequently, our system facilitates open-vocabulary querying, allowing users to locate objects using natural language descriptions. Furthermore, the framework features an optimal path generation module that calculates efficient navigation paths for robots toward queried objects, considering obstacles and environmental uncertainties. Comprehensive evaluations in various scene settings demonstrate the effectiveness of our approach in delivering high-fidelity scene reconstructions, precise object segmentation, flexible object querying, and efficient robot path planning. This work represents an additional step forward in bridging the gap between 3D scene reconstruction, semantic object understanding, and real-time environment interactions.
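The idea of tagging each Gaussian with an object identifier can be illustrated with a small sketch. It is not the authors' implementation: the function names `object_centroids` and `locate`, the `id_to_label` table, and the exact-string label match (a stand-in for real open-vocabulary grounding) are all assumptions made for illustration. The sketch only shows how per-Gaussian IDs turn a text query into a 3D location by grouping Gaussian centers.

```python
from collections import defaultdict


def object_centroids(centers, object_ids):
    """Average the centers of all Gaussians that share an object ID."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for (x, y, z), oid in zip(centers, object_ids):
        acc = sums[oid]
        acc[0] += x
        acc[1] += y
        acc[2] += z
        acc[3] += 1
    return {oid: (sx / n, sy / n, sz / n) for oid, (sx, sy, sz, n) in sums.items()}


def locate(query, id_to_label, centroids):
    """Return the centroid of the first object whose label equals the query.

    Hypothetical stand-in: a real system would ground free-form language,
    not compare strings.
    """
    q = query.strip().lower()
    for oid, label in id_to_label.items():
        if label.lower() == q and oid in centroids:
            return centroids[oid]
    return None


# Toy scene: three Gaussians, two objects (all values are made up).
centers = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 5.0, 1.0)]
object_ids = [1, 1, 2]
id_to_label = {1: "chair", 2: "lamp"}

centroids = object_centroids(centers, object_ids)
goal = locate("lamp", id_to_label, centroids)
```

The returned centroid plays the role of the goal location that a path planner would then navigate toward.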

Paper Structure

This paper contains 23 sections, 5 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: The entire Go-SLAM pipeline. The user queries the 3D reconstructed model in real time for a specific object in the environment. Go-SLAM detects the queried object and provides the 3D world coordinates of the goal location. The drone then navigates to the provided coordinates using the PRM path planning algorithm.
  • Figure 2: Overview of our Go-SLAM framework for environment reconstruction and language-embedded features.
  • Figure 3: Performance comparison between different object detectors for grounded object segmentation.
  • Figure 4: Open-vocabulary query pipeline
  • Figure 5: Reconstruction results of Office 2 scene from Replica dataset.
  • ...and 1 more figure
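
The Figure 1 caption names PRM (probabilistic roadmap) as the path planner. The following is a minimal 2D sketch of a generic PRM, not the paper's planner: the circular obstacles, the square workspace, the parameters `n_samples`, `radius`, `size`, and `seed`, and the helper `segment_free` are all assumptions chosen for illustration, and the sketch does not model environmental uncertainty. It samples collision-free points, connects nearby pairs whose straight segment is free, and runs Dijkstra over the resulting roadmap.

```python
import heapq
import math
import random


def segment_free(p, q, obstacles, step=0.05):
    """Approximate collision check: sample points along p-q against circles."""
    n = max(1, int(math.dist(p, q) / step))
    for i in range(n + 1):
        t = i / n
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True


def prm(start, goal, obstacles, n_samples=200, radius=2.0, size=10.0, seed=0):
    """Return a list of waypoints from start to goal, or None if disconnected.

    Assumes the obstacles leave free space inside the [0, size]^2 workspace;
    otherwise the sampling loop would not terminate.
    """
    rng = random.Random(seed)
    nodes = [start, goal]  # index 0 is the start, index 1 is the goal
    while len(nodes) < n_samples + 2:
        p = (rng.uniform(0, size), rng.uniform(0, size))
        if segment_free(p, p, obstacles):  # keep only collision-free samples
            nodes.append(p)

    # Roadmap: connect every pair within `radius` whose segment is free.
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and segment_free(nodes[i], nodes[j], obstacles):
                edges[i].append((j, d))
                edges[j].append((i, d))

    # Dijkstra shortest path from node 0 to node 1 on the roadmap.
    dist, prev, heap = {0: 0.0}, {}, [(0.0, 0)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == 1:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in edges[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if 1 not in dist:
        return None
    path = [1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return [nodes[i] for i in reversed(path)]


# Hypothetical scene: one circular obstacle between start and goal.
obstacles = [(5.0, 5.0, 1.5)]
path = prm((1.0, 1.0), (9.0, 9.0), obstacles)
```

The fixed seed keeps the roadmap reproducible. Because the samples are random, `prm` can return `None` when the roadmap happens to be disconnected; raising `n_samples` or `radius` makes that less likely at the cost of more collision checks.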