
Cooperative Path Planning with Asynchronous Multiagent Reinforcement Learning

Jiaming Yin, Weixiong Rao, Yu Xiao, Keshuang Tang

TL;DR

This paper studies the shortest path problem (SPP) with multiple source-destination pairs (MSD-SPP), aiming to minimize the average travel time over all routing paths. It proposes a two-stage framework of inter-region and intra-region route planning that divides the entire road network into multiple sub-graph regions.

Abstract

In this paper, we study the shortest path problem (SPP) with multiple source-destination pairs (MSD), namely MSD-SPP, to minimize the average travel time of all shortest paths. The inherent traffic capacity limits within a road network contribute to the competition among vehicles. Multi-agent reinforcement learning (MARL) models cannot offer effective and efficient path planning cooperation due to the asynchronous decision-making setting in MSD-SPP, where vehicles (a.k.a. agents) cannot simultaneously complete routing actions in the previous time step. To tackle the efficiency issue, we propose to divide an entire road network into multiple sub-graphs and subsequently execute a two-stage process of inter-region and intra-region route planning. To address the asynchronous issue, in the proposed asyn-MARL framework, we first design a global state, which exploits a low-dimensional vector to implicitly represent the joint observations and actions of multiple agents. Then we develop a novel trajectory collection mechanism to decrease the redundancy in training trajectories. Additionally, we design a novel actor network to facilitate cooperation among vehicles heading towards the same or nearby destinations, and a reachability graph aimed at preventing infinite loops in routing paths. On both synthetic and real road networks, our evaluation results demonstrate that our approach outperforms state-of-the-art planning approaches.
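The abstract mentions a reachability graph that prevents infinite loops in routing paths but does not say how it is built. The sketch below is one possible way to get that property, not the paper's construction: it assumes the pruned graph keeps only edges that move strictly closer, in hop count, to the destination. The names `reachability_graph` and `rollout`, and the toy network, are hypothetical.

```python
from collections import deque
import random


def reachability_graph(adj, dest):
    """Assumed construction: keep edge u -> v only if v is strictly
    closer (in hops) to dest than u. The result is acyclic."""
    # Reverse adjacency, so a BFS from dest yields hop distances *to* dest.
    rev = {u: [] for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            rev.setdefault(v, []).append(u)

    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        v = queue.popleft()
        for u in rev.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)

    # Nodes that cannot reach dest are dropped entirely.
    return {
        u: [v for v in adj[u] if v in dist and dist[v] < dist[u]]
        for u in adj
        if u in dist
    }


def rollout(pruned, src, dest, rng):
    """Pick any allowed next hop at random; stands in for a learned policy."""
    path = [src]
    while path[-1] != dest:
        path.append(rng.choice(pruned[path[-1]]))
    return path


# Toy directed road network containing cycles (A -> B -> A, A -> C -> D -> A).
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["B", "D"], "D": ["A"]}
pruned = reachability_graph(adj, "D")
path = rollout(pruned, "A", "D", random.Random(0))
```

Because the hop distance to the destination drops at every step, any next-hop choice restricted to `pruned` terminates. The cost is that only hop-shortest routes survive. A looser rule would keep more detours, but it would need its own argument that no cycle remains.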

Paper Structure

This paper contains 24 sections, 9 equations, 10 figures, 4 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Illustrative comparison among (a) Entire Path-based methods, (b) Next-hop node-based methods, and (c) our method. Each node represents an intersection and each edge represents a road segment.
  • Figure 2: Pipeline of the proposed method
  • Figure 3: Example for asynchronous trajectory collection
  • Figure 4: Architecture of the actor
  • Figure 5: An example of reachability graph construction
  • ...and 5 more figures

Theorems & Definitions (2)

  • Definition 1
  • Definition 2