Deep Meta Coordination Graphs for Multi-agent Reinforcement Learning

Nikunj Gupta, James Zachary Hare, Rajgopal Kannan, Viktor Prasanna

TL;DR

This work introduces Deep Meta Coordination Graphs (DMCG) to address cooperative MARL by jointly modeling higher-order and indirect agent interactions through dynamic meta coordination graphs. DMCG constructs multiple interaction-type graphs, selectively combines them, and applies graph convolutions to produce expressive agent representations that feed into a DCG-style value factorization with per-agent and pairwise terms. The approach effectively mitigates miscoordination and relative overgeneralization, delivering strong performance and sample efficiency on MACO tasks and scalable results on SMACv2. Overall, DMCG demonstrates the importance of learning adaptive, multi-hop interaction structures for robust, scalable coordination in complex multi-agent environments.
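The "DCG-style value factorization with per-agent and pairwise terms" mentioned above can be illustrated with a minimal sketch. It assumes the normalization used in the original deep coordination graph formulation (mean of per-agent utilities plus mean of pairwise payoffs over the edges); whether DMCG uses exactly this form is an assumption, and the function name `dcg_joint_value` and the dictionary layout of the payoffs are hypothetical choices made for illustration.

```python
import numpy as np

def dcg_joint_value(utilities, payoffs, edges, actions):
    """Joint value of one joint action under a DCG-style factorization.

    utilities: array [n_agents, n_actions], per-agent terms f_i(a_i)
    payoffs:   dict mapping an edge (i, j) to an array [n_actions, n_actions]
    edges:     list of (i, j) pairs in the coordination graph
    actions:   sequence of chosen action indices, one per agent
    (All names here are illustrative, not taken from the paper's code.)
    """
    n_agents = utilities.shape[0]
    # Mean of the per-agent utilities for the chosen actions.
    q = sum(utilities[i, actions[i]] for i in range(n_agents)) / n_agents
    # Mean of the pairwise payoffs over the edges (skipped for an empty graph).
    if edges:
        q += sum(payoffs[(i, j)][actions[i], actions[j]] for i, j in edges) / len(edges)
    return float(q)

# Toy example: 3 agents, 2 actions each, a chain graph 0-1-2.
utilities = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
edges = [(0, 1), (1, 2)]
payoffs = {
    (0, 1): np.array([[0.0, 1.0], [2.0, 3.0]]),
    (1, 2): np.array([[4.0, 0.0], [0.0, 6.0]]),
}
q_value = dcg_joint_value(utilities, payoffs, edges, actions=(0, 1, 1))
```

In a learned model the utilities and payoffs would come from neural networks conditioned on agent representations; here they are fixed arrays so that the arithmetic of the factorization is easy to follow.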

Abstract

This paper presents deep meta coordination graphs (DMCG) for learning cooperative policies in multi-agent reinforcement learning (MARL). Coordination graph formulations encode local interactions and accordingly factorize the joint value function of all agents to improve efficiency in MARL. However, existing approaches rely solely on pairwise relations between agents, which can oversimplify complex multi-agent interactions. DMCG goes beyond these simple direct interactions by also capturing useful higher-order and indirect relationships among agents. It generates novel graph structures accommodating multiple types of interactions and arbitrary lengths of multi-hop connections in coordination graphs to model such interactions. It then employs a graph convolutional network module to learn powerful representations in an end-to-end manner. We demonstrate its effectiveness in multiple coordination problems in MARL where other state-of-the-art methods can suffer from sample inefficiency or fail entirely. All code is available at https://github.com/Nikunj-Gupta/dmcg-marl.
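One way to picture how multiple interaction types and multi-hop connections could be combined, and then passed through a graph convolution, is sketched below. This is not the authors' implementation: it assumes a soft (softmax-weighted) selection over edge-type adjacency matrices at each hop, a matrix product across hops to compose multi-hop links, and a single row-normalized graph convolution layer. The function names `meta_adjacency` and `gcn_layer`, and all shapes, are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def meta_adjacency(adj_stack, hop_logits):
    """Compose a multi-hop adjacency from several edge types.

    adj_stack:  array [K, n, n], one adjacency matrix per interaction type
    hop_logits: array [L, K], selection scores over edge types for each hop
    Returns an [n, n] matrix whose entry (i, j) is nonzero when j can be
    reached from i by following one softly selected edge type per hop.
    """
    n = adj_stack.shape[1]
    meta = np.eye(n)
    for logits in hop_logits:
        weights = softmax(logits)                             # [K]
        selected = np.tensordot(weights, adj_stack, axes=1)   # [n, n]
        meta = meta @ selected                                # extend paths by one hop
    return meta

def gcn_layer(meta, features, weight):
    """One graph convolution: row-normalized (meta + I), linear map, ReLU."""
    a_hat = meta + np.eye(meta.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)      # row sums, always >= 1 here
    return np.maximum(0.0, (a_hat / deg) @ features @ weight)

# Toy example: 3 agents, 2 interaction types.
adj_stack = np.zeros((2, 3, 3))
adj_stack[0, 0, 1] = 1.0   # type 0: agent 0 -> agent 1
adj_stack[1, 1, 2] = 1.0   # type 1: agent 1 -> agent 2
# Hop 1 strongly prefers type 0, hop 2 strongly prefers type 1.
hop_logits = np.array([[50.0, -50.0], [-50.0, 50.0]])
meta = meta_adjacency(adj_stack, hop_logits)

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 5))   # per-agent input features
weight = rng.normal(size=(5, 4))     # layer weights (learned in practice)
hidden = gcn_layer(meta, features, weight)
```

In the toy example, agents 0 and 2 share no direct edge of either type, yet the composed matrix links them through agent 1. This is the kind of indirect relationship the abstract refers to. In a trained model the selection scores and layer weights would be learned end-to-end rather than fixed by hand.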

Paper Structure

This paper contains 32 sections, 12 equations, 7 figures.

Figures (7)

  • Figure 1: Introducing meta coordination graphs. (a) Coordination graphs model static pairwise interactions among agents. (b) We propose meta coordination graphs that introduce dynamic edge types ($e_k^t$) where $k$ denotes the type of interaction (color-coded) and evolves over time ($t$), enabling modeling of both higher-order and indirect interactions (dotted edges). This novel approach captures complex dependencies and cascading effects, offering a more adaptive and nuanced representation for multi-agent interactions.
  • Figure 2: Illustration of deep meta coordination graphs (DMCG) for learning cooperative policies in MARL. The diagram shows how the approach captures and adapts to multiple types of interactions among agents, including higher-order and indirect relationships, and generates new graph structures by exploring dynamic interactions, even among initially unconnected agents.
  • Figure 3: Evaluation environments. The figure shows an example of (a) Gather with $g_1$ as the optimal goal and {$g_2$, $g_3$} as suboptimal goals, (b) Disperse illustrating a need for 6 agents at Hospital 1 at time $t$, (c) Pursuit where at least 2 predator agents (purple) must coordinate to capture a prey (yellow), (d) Hallway with 2 groups of agents, and (e) a SMACv2 scenario.
  • Figure 4: Performance comparison of DMCG against other algorithms in the Gather, Disperse, Pursuit, and Hallway tasks. The results highlight DMCG's significant outperformance in Gather and Hallway, achieving near-perfect win rates and demonstrating superior sample efficiency. In Disperse, DMCG outperforms all methods, including a moderate margin over DCG. In the more challenging Pursuit and Hallway tasks, DMCG effectively addresses issues like relative overgeneralization and miscoordination, proving its robustness in environments with partial observability and stochastic dynamics.
  • Figure 5: Scaling DMCG to SMACv2, highlighting key metrics such as test win rate, returns, and number of dead allies and enemies. The results suggest that DMCG adopts a cautious and strategic approach, prioritizing survival and long-term returns over immediate aggression.
  • ...and 2 more figures