Attention-Guided Contrastive Role Representations for Multi-Agent Reinforcement Learning
Zican Hu, Zongzhang Zhang, Huaxiong Li, Chunlin Chen, Hongyu Ding, Zhi Wang
TL;DR
The paper tackles homogeneous agent behaviors and evolving role dynamics in cooperative MARL under centralized training with decentralized execution (CTDE). It introduces ACORM, a framework that learns discriminative role representations through mutual-information-based contrastive learning and integrates them into value decomposition via an attention mechanism for more expressive credit assignment. Empirical results on the StarCraft II Multi-Agent Challenge (SMAC) and Google Research Football show state-of-the-art performance and robust coordination, with ablations confirming the distinct contributions of contrastive role representations and attention-guided coordination. Visualizations further show that the emergent roles and attention patterns align with strategic team coordination, which suggests the approach may be useful in complex multi-agent systems.
Abstract
Real-world multi-agent tasks usually involve dynamic team composition with the emergence of roles, which should also be a key to efficient cooperation in multi-agent reinforcement learning (MARL). Drawing inspiration from the correlation between roles and agents' behavior patterns, we propose a novel framework of **A**ttention-guided **CO**ntrastive **R**ole representation learning for **M**ARL (**ACORM**) to promote behavior heterogeneity, knowledge transfer, and skillful coordination across agents. First, we introduce mutual information maximization to formalize role representation learning, derive a contrastive learning objective, and concisely approximate the distribution of negative pairs. Second, we leverage an attention mechanism to prompt the global state to attend to learned role representations in value decomposition, implicitly guiding agent coordination in a skillful role space to yield more expressive credit assignment. Experiments on challenging StarCraft II micromanagement and Google Research Football tasks demonstrate the state-of-the-art performance of our method and its advantages over existing approaches. Our code is available at [https://github.com/NJU-RL/ACORM](https://github.com/NJU-RL/ACORM).
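
To make the first step concrete, the sketch below computes the standard InfoNCE loss, a widely used contrastive objective whose minimization tightens a lower bound on mutual information. It is a minimal illustration only: the abstract does not state the exact form of ACORM's objective or how negative pairs are approximated, so the function name `info_nce_loss`, the cosine similarity, the `temperature` parameter, and the choice of one positive key per query are all assumptions rather than the authors' implementation.

```python
import numpy as np


def info_nce_loss(queries, keys, temperature=0.1):
    """Standard InfoNCE loss (illustrative; not the ACORM objective itself).

    queries, keys: arrays of shape (n, d). Row i of `keys` is assumed to be
    the positive for row i of `queries`; all other rows act as negatives.
    """
    # Normalize rows so the dot product is a cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Pairwise similarity logits, shape (n, n).
    logits = (q @ k.T) / temperature

    # Subtract the row-wise max for numerical stability before the softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the matching (diagonal) key as the target class.
    return float(-np.mean(np.diag(log_probs)))


# Hypothetical role embeddings for four agents (assumed data).
embeddings = np.eye(4)
aligned_loss = info_nce_loss(embeddings, embeddings)
shuffled_loss = info_nce_loss(embeddings, np.roll(embeddings, 1, axis=0))
```

In this toy setup, `aligned_loss` pairs each embedding with itself and is therefore much smaller than `shuffled_loss`, where every query is paired with a mismatched key. In a role-learning setting, the positives would instead come from agents judged to share a role, which is a design choice this sketch leaves open.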
