MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM

Vladimir Yugay, Theo Gevers, Martin R. Oswald

TL;DR

This work proposes new tracking and map-merging mechanisms and integrates loop closure into the Gaussian-based SLAM pipeline. Evaluated on synthetic and real-world datasets, MAGiC-SLAM is more accurate and faster than the state of the art.

Abstract

Simultaneous localization and mapping (SLAM) systems with novel view synthesis capabilities are widely used in computer vision, with applications in augmented reality, robotics, and autonomous driving. However, existing approaches are limited to single-agent operation. Recent work has addressed this problem using a distributed neural scene representation. Unfortunately, existing methods are slow, cannot accurately render real-world data, are restricted to two agents, and have limited tracking accuracy. In contrast, we propose a rigidly deformable 3D Gaussian-based scene representation that dramatically speeds up the system. However, improving tracking accuracy and reconstructing a globally consistent map from multiple agents remains challenging due to trajectory drift and discrepancies across agents' observations. Therefore, we propose new tracking and map-merging mechanisms and integrate loop closure in the Gaussian-based SLAM pipeline. We evaluate MAGiC-SLAM on synthetic and real-world datasets and find it more accurate and faster than the state of the art.

Paper Structure

This paper contains 18 sections, 13 equations, 4 figures, 13 tables.

Figures (4)

  • Figure 1: MAGiC-SLAM is a multi-agent SLAM method capable of novel view synthesis. Given single-camera RGBD input streams from multiple simultaneously operating agents, MAGiC-SLAM estimates their trajectories and reconstructs a 3D Gaussian map that can be rendered from previously unseen viewpoints. We showcase the high-fidelity 3D Gaussian map of a real-world environment alongside multiple agent trajectories (depicted in green, yellow, and blue) within it. Our method effectively utilizes information from multiple agents to achieve centimeter-level tracking accuracy. Our mapping and map-merging strategies allow for realistic rendering of color and depth, significantly improving on the state of the art. Unlike previous methods, MAGiC-SLAM is flexible in the number of agents it can handle.
  • Figure 2: MAGiC-SLAM Architecture. Agent Side: Each agent processes a separate RGBD stream, maintaining a local sub-map and estimating its trajectory. When an agent starts a new sub-map, it sends the previous sub-map and image features to the centralized server. Server Side: The server stores the image features and sub-maps from all agents and performs loop closure detection, loop constraint estimation, and pose graph optimization. It then updates the stored sub-maps and returns the optimized poses to the agents. Once the algorithm completes (denoted by green arrows), the server merges the accumulated sub-maps into a single unified map and refines it.
  • Figure 3: Map Merging. Our coarse-to-fine strategy effectively removes (a) visual artifacts caused by the GS mechanism and (c) geometric artifacts resulting from Gaussian sub-map intersections.
  • Figure 4: Rendering performance on ReplicaMultiagent (Hu et al., 2023; CP-SLAM). Thanks to the GS scene representation and an effective merging strategy, MAGiC-SLAM encodes more high-frequency details and substantially increases the quality of the renderings.
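
The Figure 2 caption says the server updates stored sub-maps after pose graph optimization, and the abstract calls the representation rigidly deformable. One plausible reading is that each sub-map's Gaussians are moved by a single rigid transform when its anchoring pose changes. The sketch below illustrates that idea only. It is not the authors' implementation: the function names (`invert_rigid`, `pose_correction`, `transform_means`), the use of 4x4 homogeneous matrices, and the assumption that Gaussian means are stored in world coordinates are all hypothetical. A full version would also rotate each Gaussian's covariance as R Σ Rᵀ, which is omitted here.

```python
from typing import List

Matrix = List[List[float]]
Point = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a 4x4 rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def pose_correction(old_pose: Matrix, new_pose: Matrix) -> Matrix:
    """Rigid transform that maps geometry anchored at old_pose onto new_pose."""
    return matmul(new_pose, invert_rigid(old_pose))


def transform_means(means: List[Point], correction: Matrix) -> List[Point]:
    """Apply the rigid correction to every Gaussian mean (world coordinates)."""
    return [
        [
            correction[i][0] * x + correction[i][1] * y
            + correction[i][2] * z + correction[i][3]
            for i in range(3)
        ]
        for x, y, z in means
    ]


# Hypothetical example: the optimizer rotates the sub-map pose by 90 degrees
# about z and shifts it by one unit along x.
identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
optimized = [[0.0, -1.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
moved = transform_means([[1.0, 0.0, 0.0]], pose_correction(identity, optimized))
```

Because the correction is one matrix per sub-map, the update costs a single pass over the Gaussians and needs no re-optimization of their parameters, which is consistent with the speed motivation stated in the abstract.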