Graph Neural Network-based Multi-agent Reinforcement Learning for Resilient Distributed Coordination of Multi-Robot Systems

Anthony Goeckner, Yueyuan Sui, Nicolas Martinet, Xinliang Li, Qi Zhu

TL;DR

This work presents a graph neural network (GNN)-based multi-agent reinforcement learning (MARL) method for resilient distributed coordination of a multi-robot system that outperforms existing methods in several experiments involving agent attrition and communication disturbance, and provides competitive results in scenarios without such anomalies.

Abstract

Existing multi-agent coordination techniques are often fragile and vulnerable to anomalies such as agent attrition and communication disturbances, which are quite common in the real-world deployment of systems like field robotics. To better prepare these systems for the real world, we present a graph neural network (GNN)-based multi-agent reinforcement learning (MARL) method for resilient distributed coordination of a multi-robot system. Our method, Multi-Agent Graph Embedding-based Coordination (MAGEC), is trained using multi-agent proximal policy optimization (PPO) and enables distributed coordination around global objectives under agent attrition, partial observability, and limited or disturbed communications. We use a multi-robot patrolling scenario to demonstrate our MAGEC method in a ROS 2-based simulator and then compare its performance with prior coordination approaches. Results demonstrate that MAGEC outperforms existing methods in several experiments involving agent attrition and communication disturbance, and provides competitive results in scenarios without such anomalies.

Paper Structure (28 sections, 7 equations, 8 figures, 2 algorithms)


Figures (8)

  • Figure 1: The overall MAGEC training architecture. Note that the critic is used only during training, following centralized training with decentralized execution (CTDE). See Figure 3 for details of the GNN block.
  • Figure 2: An example of neighbor indexing. Note that $v_3$ is neighbor $2$ of $v_0$, but $v_0$ is neighbor $1$ of $v_3$. Neighbor indexing is enforced by the environment observation mechanism.
  • Figure 3: Illustration of GNN computing node embeddings through iterative neighborhood aggregation. A scoring function is applied to the embeddings for decision-making. Note that node and edge features are concatenated during message-passing.
  • Figure 4: The agents were trained on the "Milwaukee" graph and successfully patrol on an entirely different one, "Cumberland" (above), demonstrating the generalizability of our approach. Above, red dots indicate nodes and green lines indicate edges in the patrol graph, while black lines indicate obstacles. Training is performed without obstacles, but in simulation, agents must avoid the walls.
  • Figure 5: Graphs showing the average episode reward and the evaluated average idleness over the training period. Note that training completes in a total of only 350 thousand environment steps.
  • ...and 3 more figures
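
The captions for Figures 2 and 3 describe two mechanisms that are easier to follow with a small example: ordered neighbor indexing, and node embeddings built by aggregating messages in which node and edge features are concatenated, followed by a scoring function. The sketch below is illustrative only and is not the authors' implementation. The graph, the single "idleness" node feature, the `edge_feature` helper, the mean aggregation, zero-based neighbor indices, and the hand-picked scoring weights are all assumptions; MAGEC itself uses learned weights trained with PPO.

```python
from typing import Dict, List

Vector = List[float]

# Ordered neighbor lists (assumed graph). The position in each list is the
# neighbor index, so v3 is neighbor 2 of v0 while v0 is neighbor 1 of v3.
NEIGHBORS: Dict[int, List[int]] = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [2, 0]}

# Hypothetical node feature: how long each node has gone unvisited.
NODE_FEATS: Dict[int, Vector] = {0: [0.0], 1: [5.0], 2: [1.0], 3: [9.0]}


def edge_feature(u: int, v: int) -> Vector:
    """Hypothetical edge feature standing in for an edge length."""
    return [float(abs(u - v))]


def embed(node_feats: Dict[int, Vector],
          neighbors: Dict[int, List[int]]) -> Dict[int, Vector]:
    """One round of neighborhood aggregation.

    Each message is the sender's node features concatenated with the
    features of the connecting edge. Messages are averaged, and the result
    is concatenated with the receiver's own features. A real GNN layer
    would also apply learned weights and a nonlinearity.
    """
    out: Dict[int, Vector] = {}
    for v, own in node_feats.items():
        msgs = [node_feats[u] + edge_feature(u, v) for u in neighbors[v]]
        width = len(msgs[0])
        agg = [sum(m[i] for m in msgs) / len(msgs) for i in range(width)]
        out[v] = own + agg
    return out


def choose_neighbor_index(v: int,
                          embeddings: Dict[int, Vector],
                          neighbors: Dict[int, List[int]],
                          weights: Vector) -> int:
    """Score each neighbor's embedding and return the best neighbor index."""
    scores = [sum(w * x for w, x in zip(weights, embeddings[u]))
              for u in neighbors[v]]
    return max(range(len(scores)), key=scores.__getitem__)


embeddings = embed(NODE_FEATS, NEIGHBORS)
# Hand-picked weights that score a node by its own idleness only.
weights = [1.0, 0.0, 0.0]
action = choose_neighbor_index(0, embeddings, NEIGHBORS, weights)
print(action, NEIGHBORS[0][action])
```

In this sketch the action is a neighbor index rather than a node ID, which is why a consistent neighbor ordering matters: the same index refers to different nodes depending on which node the agent currently occupies.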