Cooperative Multi-Agent Learning for Navigation via Structured State Abstraction

Mohamed K. Abdelaziz, Mohammed S. Elbamby, Sumudu Samarakoon, Mehdi Bennis

TL;DR

The paper tackles cooperative navigation under a limited field of view (FoV) by jointly learning an adaptive, structure-aware state abstraction and a communication protocol among agents. It introduces a neural architecture that uses a Graph Isomorphism Network (GIN) to encode full-resolution quadtree representations and a straight-through Gumbel-softmax to selectively prune the state space, while policies are learned via decentralized A3C-style training. Empirical results show that adaptive abstraction accelerates learning, yields higher rewards, and remains robust to substantial observation noise and channel erasure, while emergent communication further boosts performance. The reduced state complexity and the learned coordination signals make the approach attractive for scalable multi-agent navigation in uncertain environments.
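
To make the pruning mechanism mentioned above more concrete, the following is a minimal sketch of Gumbel-softmax sampling with a hard (one-hot) forward output, written in plain Python. It is not the authors' implementation: the function name `gumbel_softmax_sample`, the two-way keep/prune logits, and the temperature value are all assumptions made for illustration. Only the forward pass is shown; in a straight-through estimator the one-hot `hard` vector is used in the forward pass while gradients are taken through the `soft` vector, which requires an automatic differentiation framework.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return (soft, hard) Gumbel-softmax samples for one categorical choice.

    Hypothetical helper for illustration; not taken from the paper.
    """
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / temperature)

    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]

    # Hard one-hot vector at the arg-max of the soft sample.
    best = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == best else 0.0 for i in range(len(soft))]
    return soft, hard


# Assumed usage: index 0 means "keep this node", index 1 means "prune it".
soft, hard = gumbel_softmax_sample([1.5, 0.2], temperature=0.5,
                                   rng=random.Random(0))
```

Lower temperatures push the soft sample closer to one-hot, at the cost of higher-variance gradients.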

Abstract

Cooperative multi-agent reinforcement learning (MARL) for navigation enables agents to cooperate to achieve their navigation goals. Using emergent communication, agents learn a communication protocol to coordinate and share the information needed to achieve their navigation tasks. In emergent communication, symbols with no pre-specified usage rules are exchanged, and their meaning and syntax emerge through training. Learning a navigation policy along with a communication protocol in a MARL environment is highly complex due to the huge state space to be explored. To cope with this complexity, this work proposes a novel neural network architecture for jointly learning an adaptive state space abstraction and a communication protocol among agents participating in navigation tasks. The goal is to obtain an adaptive abstractor that significantly reduces the size of the state space to be explored without degrading policy performance. Simulation results show that the proposed method reaches a better policy, in terms of achievable rewards, in fewer training iterations than when raw states or a fixed state abstraction are used. Moreover, it is shown that a communication protocol emerges during training, which enables the agents to learn better policies within fewer training iterations.

Paper Structure (16 sections, 3 equations, 12 figures, 1 table)

Figures (12)

  • Figure 1: A $15\times 15$ grid world (left) with obstacles (black tiles) and two agents (circles) having a partial observation of size $8\times8$. Agents cooperatively navigate to the destination (green tile) by only utilizing their partial observation and communicated symbols (right).
  • Figure 2: Different levels of quadtree representation (top) with the corresponding observations (bottom).
  • Figure 3: Neural network architecture of the proposed abstractor at each agent.
  • Figure 4: An illustration of the $k^{\text{th}}$ iteration (or layer) of a general GNN model.
  • Figure 5: The reward achieved by agent $0$ throughout training.
  • ...and 7 more figures
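
As a rough illustration of the quadtree levels referred to in Figure 2, the sketch below builds a classical quadtree over a small binary occupancy grid, merging any quadrant whose cells all share the same value. This is an assumption-laden stand-in: the paper learns which parts of the tree to prune, whereas this example uses a fixed uniformity rule. The names `build_quadtree` and `count_leaves` and the $4\times 4$ example grid are hypothetical.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Recursively summarize a square 0/1 grid whose side is a power of two.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if size is None:
        size = len(grid)
    cells = [grid[r][c]
             for r in range(row, row + size)
             for c in range(col, col + size)]
    # A uniform block (or a single cell) becomes a leaf.
    if size == 1 or all(v == cells[0] for v in cells):
        return {"row": row, "col": col, "size": size,
                "value": cells[0], "children": []}
    # Otherwise split into four equal quadrants and recurse.
    half = size // 2
    children = [build_quadtree(grid, row + dr, col + dc, half)
                for dr in (0, half) for dc in (0, half)]
    return {"row": row, "col": col, "size": size,
            "value": None, "children": children}


def count_leaves(node):
    """Count the leaves, i.e. the size of the abstracted observation."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])


# Assumed example: one obstacle (1) in an otherwise free 4x4 observation.
grid = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
tree = build_quadtree(grid)
print(count_leaves(tree))  # 7 leaves instead of 16 raw cells
```

Three of the four top-level quadrants are uniform and collapse to single leaves, so only the quadrant containing the obstacle is kept at full resolution.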