Optimizing Crowd-Aware Multi-Agent Path Finding through Local Communication with Graph Neural Networks

Phu Pham; Aniket Bera

TL;DR

CRAMP addresses crowded multi-agent path finding by combining crowd-aware reinforcement learning with local communication via Graph Neural Networks and a boosted curriculum to train robust decentralized policies. The approach formulates MAPF as sequential decision making in a partially observable grid world, leveraging A3C with a CNN-LSTM-GNN stack, a demonstration learning component from an expert, and a crowd-aware reward that discourages congestion through a dynamic density threshold. Empirically, CRAMP outperforms state-of-the-art decentralized MAPF methods on metrics including success rate, makespan, total moves, and collision count, and remains robust as map size and density grow; ablation confirms the contributions of the crowd-aware reward and GNN communication. The work provides a scalable, coordination-friendly framework for dense agent crowds with potential applications in autonomous robotics and intelligent transportation.

Abstract

Multi-Agent Path Finding (MAPF) in crowded environments is a challenging motion-planning problem in which the goal is to find collision-free paths for all agents in the system. MAPF has a wide range of applications, including aerial swarms, autonomous warehouse robotics, and self-driving vehicles. Current approaches to MAPF generally fall into two main categories: centralized and decentralized planning. Centralized planning suffers from the curse of dimensionality when the number of agents or states increases and thus does not scale well in large and complex environments. Decentralized planning, on the other hand, lets agents plan paths in real time within a partially observable environment and exhibits implicit coordination. However, decentralized methods suffer from slow convergence and performance degradation in dense environments. In this paper, we introduce CRAMP, a novel crowd-aware decentralized reinforcement learning approach that addresses this problem by enabling efficient local communication among agents via Graph Neural Networks (GNNs), improving situational awareness and decision-making in congested environments. We test CRAMP in simulated environments and demonstrate that our method outperforms state-of-the-art decentralized methods for MAPF on various metrics. Compared to previous methods, CRAMP improves solution quality by up to 59% as measured by makespan and collision count, and improves success rate by up to 35%.
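
To make the idea of local communication concrete, the following is a minimal sketch of one round of neighbor aggregation over a communication graph. It is not the CRAMP architecture: the Chebyshev-distance communication radius, the mean aggregation, the tanh nonlinearity, and the names `communication_graph`, `message_passing_step`, `w_self`, and `w_neigh` are all illustrative assumptions.

```python
import numpy as np

def communication_graph(positions, radius):
    """Adjacency matrix linking agents within `radius` cells of each other.

    Assumption: Chebyshev distance on the grid; the paper may define
    neighborhoods differently.
    """
    pos = np.asarray(positions)
    # Pairwise Chebyshev distance between all agents.
    dist = np.abs(pos[:, None, :] - pos[None, :, :]).max(axis=-1)
    adj = (dist <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def message_passing_step(features, adj, w_self, w_neigh):
    """One round of mean-aggregation message passing (hypothetical layer)."""
    degree = adj.sum(axis=1, keepdims=True)
    # Average neighbor features; isolated agents receive a zero message.
    neigh_mean = adj @ features / np.maximum(degree, 1.0)
    return np.tanh(features @ w_self + neigh_mean @ w_neigh)

positions = [(0, 0), (1, 1), (5, 5)]
features = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
adj = communication_graph(positions, radius=2)
updated = message_passing_step(features, adj, np.eye(2), np.eye(2))
```

In this toy setup the first two agents exchange features because they are within two cells of each other, while the third agent is out of range and only transforms its own features.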

Paper Structure (19 sections, 12 equations, 5 figures, 2 tables)

INTRODUCTION
RELATED WORK
Centralized approaches
Decentralized approaches
Multi-agent reinforcement learning
Graph neural networks for MAPF
OUR APPROACH
World Modeling
Reinforcement learning for local policy training
Demonstration learning
Crowd-aware reward function
Graph neural network for local communication
Boosted curriculum learning
EXPERIMENTS AND RESULTS
Experiment setup
...and 4 more sections

Figures (5)

Figure 1: Comparison of paths between our CRAMP approach and previous state-of-the-art methods. Colored circles with numerical labels represent the starting positions of agents, while dashed circles with corresponding numbers denote the target positions. Obstacles are marked with grey squares. Our crowd-aware method yields notably shorter solutions than existing approaches such as PRIMAL and PICO.
Figure 2: The network architecture to train the distributed local policies. The network takes local observation of size $10 \times 10 \times 4$ as inputs. The network's outputs include the predicted probabilities for the actions (size $5 \times 1$), predicted return value, validity of predicted action and blocking probabilities (all of size $1\times1$).
Figure 3: Example of agents getting negative and positive rewards by moving in or out of a $\zeta$-crowded region. The colored circles with numerical labels are the agents and the dashed circles represent the previous positions.
Figure 4: Performance comparison between CRAMP and other methods in larger and denser worlds.
Figure 5: Effect of a crowd-aware reward function and GNN-based local communication.
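
The reward shaping described in the Figure 3 caption can be sketched as follows, under stated assumptions. The captions do not define a $\zeta$-crowded region precisely, so this sketch assumes a cell is $\zeta$-crowded when the fraction of occupied neighboring cells in a square window exceeds `zeta`. The window radius, the reward magnitudes, and the names `local_density` and `crowd_reward` are hypothetical, and how the threshold is made dynamic is not covered here.

```python
def local_density(pos, others, radius=1):
    """Fraction of cells around `pos` (excluding `pos`) occupied by other agents."""
    side = 2 * radius + 1
    count = sum(
        1 for o in others
        if max(abs(o[0] - pos[0]), abs(o[1] - pos[1])) <= radius
    )
    return count / (side * side - 1)

def crowd_reward(prev_pos, new_pos, others, zeta, radius=1,
                 penalty=-0.5, bonus=0.5):
    """Penalize entering a zeta-crowded region and reward leaving one.

    The magnitudes of `penalty` and `bonus` are placeholders, not values
    taken from the paper.
    """
    was_crowded = local_density(prev_pos, others, radius) > zeta
    now_crowded = local_density(new_pos, others, radius) > zeta
    if now_crowded and not was_crowded:
        return penalty
    if was_crowded and not now_crowded:
        return bonus
    return 0.0

others = [(0, 0), (0, 1), (1, 0)]
r_enter = crowd_reward((2, 1), (1, 1), others, zeta=0.25)
r_leave = crowd_reward((1, 1), (2, 1), others, zeta=0.25)
r_neutral = crowd_reward((3, 1), (2, 1), others, zeta=0.25)
```

Here an agent stepping into the cell surrounded by three other agents crosses the assumed density threshold and is penalized, the reverse move is rewarded, and a move between two uncrowded cells is neutral.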
