Multi-Robot Reliable Navigation in Uncertain Topological Environments with Graph Attention Networks

Zhuoyuan Yu, Hongliang Guo, Albertus Hendrawan Adiwahono, Jianle Chan, Brina Shong Wey Tynn, Chee-Meng Chew, Wei-Yun Yau

TL;DR

This letter reformulates the multi-robot reliable navigation problem in uncertain topological networks into a Partially Observable Markov Decision Process (POMDP) framework and introduces the Dynamic Adaptive Graph Embedding method to capture the evolving nature of the navigation task.

Abstract

This paper studies the multi-robot reliable navigation problem in uncertain topological networks, which aims to maximize the robot team's on-time arrival probabilities in the face of road network uncertainties. The uncertainty in these networks stems from unknown edge traversability, which is revealed to a robot only upon its arrival at the edge's starting node. Existing approaches often struggle to adapt to real-time changes in network topology, making them unsuitable for varying topological environments. To address this challenge, we reformulate the problem as a Partially Observable Markov Decision Process (POMDP) and introduce the Dynamic Adaptive Graph Embedding method to capture the evolving nature of the navigation task. We further enhance each robot's policy learning process by integrating deep reinforcement learning with Graph Attention Networks (GATs), leveraging self-attention to focus on critical graph features. The proposed approach, named Multi-Agent Routing in Variable Environments with Learning (MARVEL), employs the generalized policy gradient algorithm to iteratively optimize the robots' real-time decision-making process. We compare the performance of MARVEL with state-of-the-art reliable navigation algorithms as well as Canadian traveller problem solutions in a range of canonical transportation networks, demonstrating improved adaptability and performance in uncertain topological networks. Additionally, real-world experiments with two robots navigating within a self-constructed indoor environment with uncertain topological structures demonstrate MARVEL's practicality.
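
To make the phrase "leveraging self-attention to focus on critical graph features" concrete, the following is a minimal sketch of a single-head graph attention layer in its standard textbook form. It is not taken from the paper: the function name `gat_layer`, the parameter shapes, and the LeakyReLU slope of 0.2 are assumptions for illustration, and MARVEL's actual network may differ.

```python
import numpy as np

def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer (standard GAT form, illustrative only).

    h:   (N, F)  node feature matrix
    adj: (N, N)  0/1 adjacency matrix that includes self-loops
    W:   (F, F2) shared linear projection (hypothetical parameter)
    a:   (2*F2,) attention vector (hypothetical parameter)
    Returns the aggregated features (N, F2) and attention weights (N, N).
    """
    z = h @ W                               # project every node
    f2 = z.shape[1]
    # a . [z_i || z_j] splits into a source term plus a destination term
    src = z @ a[:f2]
    dst = z @ a[f2:]
    e = src[:, None] + dst[None, :]         # raw score for every pair (i, j)
    e = np.where(e > 0, e, slope * e)       # LeakyReLU
    e = np.where(adj > 0, e, -np.inf)       # mask out non-neighbors
    e = e - e.max(axis=1, keepdims=True)    # stabilize the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha                 # attention-weighted aggregation
```

Each row of `alpha` is a probability distribution over that node's neighbors, which is the kind of per-neighbor influence weight the Figure 2 caption describes as a bar chart.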

Paper Structure

This paper contains 27 sections, 22 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: The illustrative example: In the topological network, the blue and green markers represent Robot A and Robot B, respectively. The red dashed edge is an uncertain edge labelled with traversal probability. The right table provides the corresponding data for the topological map. Each edge has an average cost and standard deviation; the actual cost for each edge is sampled from a Gaussian distribution with rejection. The objective is to maximize the on-time arrival probability for the entire team, which is computed based on the data from subsequent edges combined with time budget $T_i$. In other algorithms, Robot A always selects 9→10→11→12, and Robot B typically learns a fixed policy to select either 5 or 14 at node 3, regardless of real-time conditions. However, MARVEL guides Robot A to take path 9→10→13→11→12 to test the traversability of edge 13→4, allowing adaptive decision-making for Robot B at node 3 based on the real-time transportation networks.
  • Figure 2: The MARVEL framework outlines a processing pipeline composed of a graph embedder, a graph neural network module, and a policy update module based on the online expert. (i) The feature matrix and adjacency matrix are dynamically updated by the real-time structural and positional information. The corpus is generated for each node based on neighbors and shortest paths to all destination nodes. (ii) Our policy network captures topological features, calculates weights using a self-attention mechanism, and selectively aggregates useful information for decision-making. The bar chart represents the influence weights of subsequent vertices in different scenarios. (iii) The policy is dynamically optimized by the online expert-augmented policy gradient method. The cross-entropy loss is calculated between predicted values and prior solutions, enabling effective adjustment of the attention distribution and policy updates.
  • Figure 3: Comparison of SOTA probability across different scenarios.
  • Figure 4: Comparison of SOTA probability against the baseline algorithms on different canonical transportation networks.
  • Figure 5: Constructed physical environments and RViz virtual maps.
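
The Figure 1 caption describes edge costs drawn from a Gaussian distribution with rejection and an objective based on on-time arrival probability under a time budget. A minimal Monte Carlo sketch of that quantity for one robot on a fixed path is given below. The rejection rule (discarding non-positive costs), the function names, and the two-branch treatment of an uncertain edge are assumptions for illustration; the caption does not specify them, and this is not the paper's algorithm.

```python
import random

def sample_edge_cost(mean, std, rng):
    """Draw a Gaussian edge cost, rejecting non-positive samples.

    Assumption: the rejection criterion is cost > 0, so mean must be
    positive whenever std is 0.
    """
    while True:
        cost = rng.gauss(mean, std)
        if cost > 0:
            return cost

def on_time_probability(path_edges, budget, n_samples=10000, seed=0):
    """Monte Carlo estimate of P(total cost of path_edges <= budget).

    path_edges: list of (mean, std) pairs, one per edge on the path.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        total = sum(sample_edge_cost(m, s, rng) for m, s in path_edges)
        if total <= budget:
            hits += 1
    return hits / n_samples

def on_time_with_uncertain_edge(prefix, direct, detour, p_open, budget):
    """On-time probability when one edge is traversable with probability p_open.

    The robot follows `prefix`, observes the uncertain edge at its starting
    node, then takes `direct` if it is open and `detour` otherwise.
    """
    p_direct = on_time_probability(prefix + direct, budget)
    p_detour = on_time_probability(prefix + detour, budget)
    return p_open * p_direct + (1.0 - p_open) * p_detour
```

For example, `on_time_probability([(10.0, 1.0)], budget=10.0)` should come out close to 0.5, since the budget equals the mean cost and rejection of non-positive samples is negligible at that mean and standard deviation.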