Asynchronous Cooperative Multi-Agent Reinforcement Learning with Limited Communication
Sydney Dolan, Siddharth Nayak, Jasmine Jerry Aloor, Hamsa Balakrishnan
TL;DR
AsynCoMARL tackles coordination in partially observable multi-agent settings with limited, asynchronous communication by learning communication protocols with a graph transformer that operates on a dynamic agent-entity graph. Each active agent maintains a local graph embedding computed by a two-layer UniMP graph transformer, while a centralized critic receives global graph representations to guide policy updates under a MAPPO/PPO framework. An edge forms only between agents that are in proximity and whose actions are synchronized, and inactive agents are masked. The approach achieves competitive success and collision rates while reducing inter-agent messages by up to about 26% in Cooperative Navigation, and it performs comparably to baselines in Rover-Tower, which indicates robust performance under asynchronous communication constraints. The work is aimed at practical MARL for space missions and planetary rovers, where communication efficiency must be traded off against coordination quality, and it examines how dynamic graph attention weighs proximity against communication frequency as the team evolves.
Abstract
We consider the problem setting in which multiple autonomous agents must cooperatively navigate and perform tasks in an unknown, communication-constrained environment. Traditional multi-agent reinforcement learning (MARL) approaches assume synchronous communications and perform poorly in such environments. We propose AsynCoMARL, an asynchronous MARL approach that uses graph transformers to learn communication protocols from dynamic graphs. AsynCoMARL can accommodate infrequent and asynchronous communications between agents, with edges of the graph forming only when agents communicate with each other. We show that AsynCoMARL achieves success and collision rates similar to those of leading baselines while passing 26% fewer messages between agents.
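To make the dynamic-graph idea concrete, the following is a minimal sketch of how an edge set might be built at a single timestep. It is not the authors' implementation. The function name `build_edges`, the parameter `comm_radius`, and the rule "both agents active at the same step and within a fixed radius" are illustrative assumptions standing in for the proximity and synchronization conditions described above.

```python
import numpy as np

def build_edges(positions, active, comm_radius):
    """Boolean adjacency matrix for one timestep (illustrative assumption).

    positions   : (n, d) array of agent positions
    active      : (n,) boolean array, True if the agent acts at this step
    comm_radius : hypothetical proximity threshold for communication
    """
    positions = np.asarray(positions, dtype=float)
    active = np.asarray(active, dtype=bool)

    # Pairwise Euclidean distances via broadcasting.
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    # Proximity condition: agents are close enough to communicate.
    near = dist <= comm_radius

    # Synchronization condition (assumed): both endpoints act at this step.
    # This also masks every edge that touches an inactive agent.
    both_active = active[:, None] & active[None, :]

    adj = near & both_active
    np.fill_diagonal(adj, False)  # no self-edges
    return adj

# Three agents on a line; the third is out of range of the others.
positions = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
adj_all = build_edges(positions, [True, True, True], comm_radius=2.0)
adj_masked = build_edges(positions, [True, False, True], comm_radius=2.0)
```

Under these assumptions, the number of directed messages at a step is `adj.sum()`, so masking inactive agents directly lowers the message count, which is the quantity the abstract reports as reduced.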
