Scaling Team Coordination on Graphs with Reinforcement Learning

Manshi Limbu; Zechen Hu; Xuan Wang; Daigo Shishika; Xuesu Xiao

TL;DR

This paper formulates the problem of team coordination on graphs with risky edges as a Markov Decision Process (MDP) with a novel state and action space, and investigates how RL can solve it. It shows that RL efficiently solves problems with up to 20 nodes and 4 agents, or 25 nodes and 3 agents, in a fraction of the time the Joint State Graph (JSG) approach needs for problems of that size.

Abstract

This paper studies Reinforcement Learning (RL) techniques that enable, in a centralized manner, team coordination behaviors in graph environments, where support actions among teammates reduce the cost of traversing certain risky edges. While classical approaches can solve this non-standard multi-agent path planning problem by converting the original Environment Graph (EG) into a Joint State Graph (JSG) that implicitly incorporates the support actions, those methods do not scale well to large graphs and teams. To address this curse of dimensionality, we propose to use RL to enable agents to learn such graph traversal and teammate supporting behaviors in a data-driven manner. Specifically, through a new formulation of the team coordination on graphs with risky edges problem as a Markov Decision Process (MDP) with a novel state and action space, we investigate how RL can solve it in two paradigms. First, we use RL for a team of agents to learn how to coordinate and reach the goal with minimal cost on a single EG. We show that RL efficiently solves problems with up to 20/4 or 25/3 nodes/agents, using a fraction of the time needed for JSG to solve such complex problems. Second, we learn a general RL policy for any $N$-node EG to produce efficient supporting behaviors. We present extensive experiments and compare our RL approaches against their classical counterparts.
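To make the scaling concern concrete, the following is a minimal sketch of how a joint state space can grow with team size. It assumes that a joint state is simply a tuple of agent positions, so that $N$ nodes and $M$ agents give at most $N^M$ joint states. The function name `joint_state_count` is hypothetical, and the actual JSG construction in the paper may differ in its details.

```python
def joint_state_count(num_nodes: int, num_agents: int) -> int:
    """Upper bound on joint states, ASSUMING each agent may occupy any node
    independently (an illustrative assumption, not the paper's definition)."""
    return num_nodes ** num_agents


# Problem sizes mentioned in the abstract and figure captions (nodes, agents).
for nodes, agents in [(5, 2), (10, 2), (20, 4), (25, 3)]:
    count = joint_state_count(nodes, agents)
    print(f"{nodes} nodes, {agents} agents: up to {count} joint states")
```

Under this assumption, the 20/4 case has up to 160,000 joint states and the 25/3 case up to 15,625, which illustrates why enumerating every joint state becomes expensive as graphs and teams grow.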

Paper Structure (29 sections, 13 equations, 3 figures, 1 table)

INTRODUCTION
RELATED WORK
Multi-Agent Systems
Multi-Agent Reinforcement Learning
PROBLEM FORMULATION
Team Coordination on Graphs
MDP Formulation
State Space
Action Space
Reward Function
State Transition Function
Full MDP
Reinforcement Learning for Single and Multiple EG(s)
Single EG
Multiple EGs
...and 14 more sections

Figures (3)

Figure 1: Team coordination with reinforcement learning on a single graph (a) and on multiple graphs (b) with risky edges and supporting behaviors to reduce risk.
Figure 2: Single EG: Optimality vs. Time plots for four agents on sparse, moderate, and dense graphs.
Figure 3: Multiple EGs: Optimality vs. Time plots for two agents on any 5- or 10-node graphs.
