$D^2$SLAM: Decentralized and Distributed Collaborative Visual-inertial SLAM System for Aerial Swarm

Hao Xu; Peize Liu; Xinyi Chen; Shaojie Shen

TL;DR

This work tackles the dual challenge of achieving high-precision relative localization for nearby UAVs and maintaining globally consistent trajectories as robots drift apart in aerial swarms. It introduces $D^2$SLAM, a fully decentralized and distributed CSLAM framework that combines near-field ego-motion/relative-state estimation via $D^2$VINS (ADMM-based distributed VIO with manifold optimization) and far-field global trajectory optimization via $D^2$PGO (ARock-based asynchronous distributed PGO). Key contributions include the design of a flexible front-end, a mode-based communication protocol with map-merging, and robust back-ends that handle network latency and asynchronous updates, demonstrated through extensive simulations and real-world experiments with multi-UAV swarms. The system achieves centimeter-level relative localization in proximity and maintains global consistency over larger distances, while remaining scalable through controllable front-end and back-end load and resilient to communication delays. This work advances practical, scalable autonomous aerial swarms by providing a tightly integrated, distributed SLAM solution adaptable to various camera configurations and communication constraints.

Abstract

Collaborative simultaneous localization and mapping (CSLAM) is essential for autonomous aerial swarms, laying the foundation for downstream algorithms such as planning and control. To address existing CSLAM systems' limitations in relative localization accuracy, which is crucial for close-range UAV collaboration, this paper introduces $D^2$SLAM, a novel decentralized and distributed CSLAM system. $D^2$SLAM combines near-field estimation, for precise relative state estimation in proximity, with far-field estimation, for consistent global trajectories. Its adaptable front-end supports both stereo and omnidirectional cameras, catering to various operational needs and overcoming field-of-view challenges in aerial swarms. Experiments demonstrate $D^2$SLAM's effectiveness in accurate ego-motion estimation, relative localization, and global consistency. Enhanced by distributed optimization algorithms, $D^2$SLAM exhibits remarkable scalability and resilience to network delays, making it well-suited for a wide range of real-world aerial swarm applications. The adaptability and proven performance of $D^2$SLAM represent a significant advancement in autonomous aerial swarm technology.

Paper Structure (56 sections, 22 equations, 15 figures, 9 tables, 5 algorithms)

Introduction
Related Works
Distributed SLAM Techniques
Distributed pose graph optimization
Distributed bundle adjustment
Current CSLAM Systems
Preliminary
State Estimation Problem of CSLAM on Aerial Swarm
Decentralized vs Distributed
System Overview
$D^2$SLAM System Architecture
Communication Modes
Multi-Robot Map Merging
Operating Conditions of $D^2$SLAM
Communication
...and 41 more sections

Figures (15)

Figure 1: The demonstration of $D^2$SLAM in the HKUST RI dataset: a) The dense map generated by $D^2$SLAM using TSDF reconstruction, showing only the surface voxels from TSDF. b) The estimated trajectories of $D^2$VINS (for near-field state estimation) and $D^2$PGO (for far-field state estimation) in a three-UAV scenario.
Figure 2: The architecture of $D^2$SLAM. $D^2$SLAM runs independently on each UAV. Data is first processed by the front-end and then sent to the back-end for state estimation. The results can be used for dense mapping, planning, and control.
Figure 3: Visual-inertial UWB fusion [xu2020decentralized] and Omni-Swarm [xu2022omni] are decentralized, and asynchronous Distributed ADMM [zhang2014asynchronous] is distributed. $D^2$SLAM is both decentralized and distributed.
Figure 4: a) A state machine governing $D^2$SLAM's communication modes begins in discover mode. In this initial phase, each UAV broadcasts its complete keyframe to support rapid initialization and transmits data essential for $D^2$VINS. Once UAVs have successfully initialized their relative states, $D^2$SLAM transitions to either near or far mode, depending on their estimated relative positions. The system can fluidly alternate between near and far modes in response to changes in UAV positioning. Importantly, if a new UAV enters the system, typically seen during the initialization phase, $D^2$SLAM reverts to discover mode to incorporate this new member. b) Map merging occurs during the discover mode, as shown in the flow charts. If near or far-field state estimation identifies other UAVs, the UAV with the higher index transfers its map state to the UAV with the lower index.
Figure 5: The front-end of $D^2$SLAM processes visual data in several stages. Initially, the data undergoes reprojection, followed by extraction of global descriptors and features. Subsequently, it is used for feature tracking, multi-UAV feature matching, and loop closure detection. The final results are then fused in the back-end.
...and 10 more figures
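
The communication-mode state machine and the map-merging rule described in the Figure 4 caption can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: the names `Mode`, `next_mode`, `merge_direction`, and the threshold `NEAR_RANGE_M` are hypothetical, and the sketch assumes the near/far decision can be reduced to a single estimated distance, which the caption does not specify.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    DISCOVER = "discover"
    NEAR = "near"
    FAR = "far"


# Hypothetical near/far boundary in metres; this value is NOT from the paper.
NEAR_RANGE_M = 3.0


def next_mode(initialized: bool, new_uav_detected: bool, distance_m: float) -> Mode:
    """Pick the communication mode for the next step.

    The system starts in (and reverts to) discover mode until relative
    states are initialized or whenever a new UAV joins; otherwise it
    switches between near and far based on the estimated distance.
    """
    if new_uav_detected or not initialized:
        return Mode.DISCOVER
    return Mode.NEAR if distance_m <= NEAR_RANGE_M else Mode.FAR


def merge_direction(uav_a: int, uav_b: int) -> Tuple[int, int]:
    """Return (sender, receiver) for map merging.

    Per the caption, the UAV with the higher index transfers its map
    state to the UAV with the lower index.
    """
    return (max(uav_a, uav_b), min(uav_a, uav_b))
```

For example, under these assumptions `next_mode(True, False, 10.0)` yields `Mode.FAR`, and `merge_direction(2, 5)` indicates that UAV 5 sends its map state to UAV 2.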
