Inverse Attention Agents for Multi-Agent Systems
Qian Long, Ruoyan Li, Minglu Zhao, Tao Gao, Demetri Terzopoulos
TL;DR
This work tackles ad-hoc coordination in multi-agent reinforcement learning by introducing Inverse Attention Agents, which treat attention as a core mental state in the sense of Theory of Mind (ToM). The approach combines a gradient-field representation of goals with a self-attention policy and an inverse-attention network that infers other agents’ attention weights and uses them to adjust the agent’s own actions, enabling generalization to unseen teammates and humans. Training proceeds in three phases, all within fully observable Markov games with gradient-field observations: (1) self-attention learns the agent’s weights over goals, (2) inverse attention learns to predict other agents’ attention, and (3) the agent updates its own attention using the inferred weights. Across five continuous Multi-Agent Particle Environment (MPE) tasks and in human experiments, Inverse Attention Agents outperform baselines in cooperative, competitive, and mixed settings, particularly in ad-hoc collaboration, and they predict other agents’ attentional focus accurately. The results point to a practical path toward robust, human-compatible multi-agent systems that adapt to diverse teammates and opponents in real time.
Abstract
A major challenge for Multi-Agent Systems is enabling agents to adapt dynamically to diverse environments in which opponents and teammates may continually change. Agents trained using conventional methods tend to excel only within the confines of their training cohorts; their performance drops significantly when confronting unfamiliar agents. To address this shortcoming, we introduce Inverse Attention Agents that adopt concepts from the Theory of Mind (ToM), implemented algorithmically using an attention mechanism trained in an end-to-end manner. Crucial to determining the final actions of these agents, the weights in their attention model explicitly represent attention to different goals. We furthermore propose an inverse attention network that deduces the ToM of agents based on observations and prior actions. The network infers the attentional states of other agents, and the agent uses these inferred states to refine its own attention weights and adjust its final action. We conduct experiments in a continuous environment on demanding tasks that encompass cooperation, competition, and a blend of both. The results demonstrate that the inverse attention network successfully infers the attention of other agents, and that this information improves agent performance. Additional human experiments show that, compared to baseline agent models, our inverse attention agents exhibit superior cooperation with humans and better emulate human behaviors.
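The pipeline described in the abstract (attention weights over goals, inference of another agent's attention from its action, and refinement of one's own weights) can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and is not the paper's implementation: goals are represented by fixed 2D direction vectors standing in for gradient-field observations, the action is taken to be the attention-weighted sum of those directions, the learned inverse attention network is replaced by a closed-form least-squares solve, and `update_own_attention` is a hypothetical hand-written rule rather than a trained module.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()

def act(goal_dirs, logits):
    """Forward step (assumed form): attention weights over goals,
    action = attention-weighted sum of goal directions."""
    w = softmax(logits)
    return w, w @ goal_dirs

def infer_attention(goal_dirs, observed_action):
    """Inverse step (stand-in for a learned network): recover weights w
    with sum(w) = 1 such that w @ goal_dirs matches the observed action."""
    k = goal_dirs.shape[0]
    # Stack the action equations with the sum-to-one constraint.
    A = np.vstack([goal_dirs.T, np.ones(k)])
    b = np.append(observed_action, 1.0)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    w = np.clip(w, 0.0, None)  # keep weights non-negative
    return w / w.sum()

def update_own_attention(own_w, other_w, alpha=0.5):
    """Hypothetical cooperative rule: attend less to goals that the
    other agent already attends to, then renormalize."""
    adjusted = own_w * (1.0 - alpha * other_w)
    return adjusted / adjusted.sum()

# Three goals in the plane; the direction vectors are illustrative values.
goal_dirs = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])

# A teammate acts with attention concentrated on goal 0.
true_w, teammate_action = act(goal_dirs, np.array([2.0, 0.0, -1.0]))

# Infer the teammate's attention from its action alone.
inferred_w = infer_attention(goal_dirs, teammate_action)

# Refine our own (initially uniform) attention with the inferred weights.
own_w = np.full(3, 1.0 / 3.0)
new_w = update_own_attention(own_w, inferred_w)
```

In this toy setting the inverse step is exact only because three non-collinear goal directions in 2D, together with the sum-to-one constraint, determine the weights uniquely. With more goals or noisy actions the mapping from action to attention is ambiguous, which is one reason an inference model trained on observations and prior actions, as the abstract describes, would be needed in place of a linear solve.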
