Inverse Attention Agents for Multi-Agent Systems

Qian Long, Ruoyan Li, Minglu Zhao, Tao Gao, Demetri Terzopoulos

TL;DR

This work tackles ad-hoc coordination in multi-agent reinforcement learning by introducing Inverse Attention Agents, which treat attention as a core mental state in the sense of Theory of Mind. The agents combine a gradient-field representation of goals with a self-attention policy and an inverse-attention network that infers other agents’ attention weights and uses them to adjust the agent's own actions, enabling generalization to unseen teammates and humans. Training proceeds in three phases: (1) self-attention to learn goal weights, (2) inverse attention to predict others’ attention, and (3) updating the agent's own attention using the inferred weights, all within fully observable Markov games with gradient-field observations. Empirical results across five continuous MPE environments and human experiments show that Inverse Attention Agents outperform baselines in cooperative, competitive, and mixed settings, particularly in ad-hoc collaborations, and that the inverse-attention network predicts others’ attentional focus accurately. This points to a practical path toward robust, human-compatible multi-agent systems that can adapt to diverse teammates and opponents in real time.

Abstract

A major challenge for Multi-Agent Systems is enabling agents to adapt dynamically to diverse environments in which opponents and teammates may continually change. Agents trained using conventional methods tend to excel only within the confines of their training cohorts; their performance drops significantly when confronting unfamiliar agents. To address this shortcoming, we introduce Inverse Attention Agents that adopt concepts from the Theory of Mind (ToM) implemented algorithmically using an attention mechanism trained in an end-to-end manner. Crucial to determining the final actions of these agents, the weights in their attention model explicitly represent attention to different goals. We furthermore propose an inverse attention network that deduces the ToM of agents based on observations and prior actions. The network infers the attentional states of other agents, thereby refining the attention weights to adjust the agent's final action. We conduct experiments in a continuous environment, tackling demanding tasks encompassing cooperation, competition, and a blend of both. They demonstrate that the inverse attention network successfully infers the attention of other agents, and that this information improves agent performance. Additional human experiments show that, compared to baseline agent models, our inverse attention agents exhibit superior cooperation with humans and better emulate human behaviors.

Paper Structure

This paper contains 49 sections, 9 equations, 5 figures, 13 tables, 1 algorithm.

Figures (5)

  • Figure 1: Pipeline for training the inverse attention agent. In the first phase, the agent applies a self-attention mechanism: it assigns attention weights to its observations and acts based on these weights. In the second phase, the agent performs attention inference on other agents of the same type using the inverse attention network. By placing itself in the position of these agents, it infers their attention weights, gaining insight into their goals and behaviors. In the final phase, the inverse attention agent uses the inferred weights from the previous phase to update its original attention weights from $\{w_1,w_2,\dots,w_n\}$ to $\{\tilde{w}_1,\tilde{w}_2,\dots,\tilde{w}_n\}$, which in turn changes its final actions.
  • Figure 2: Network architecture of the inverse attention agent. For agent $i$, $W_i$ is the observation embedding function, which takes in the observation and outputs the initial attention weights. $\textit{IW}_i$ is the inverse attention network, which takes in the actions and observations of the other agents and outputs their inferred attention weights. $\textit{UW}_i$ takes the agent's own initial weights together with the weights inferred for the other agents and updates $a_i$'s attention weights. The function $h_i$ outputs the final action based on the updated weights.
  • Figure 3: Environment visualizations of the spread, adversary, and grassland games.
  • Figure 4: Prediction accuracy of the inverse attention network across five roles in the spread, adversary, and grassland environments, evaluated at the scales {spread: 3, adversary: 3-3, grassland: 3-3}. In each bar graph, the bars show, from left to right, the prediction accuracy for the most attended goal through the least attended goal. The results demonstrate that the inverse network can accurately predict the attention of other agents, particularly for their two most attended goals.
  • Figure 5: Qualitative results in the spread, adversary, and grassland games in MPE demonstrate that the Inverse-Att agents can successfully adapt to unseen agents.
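
The data flow described in the captions for Figures 1 and 2 can be illustrated with a minimal sketch. It is not the authors' implementation: in the paper, $W_i$, $\textit{IW}_i$, $\textit{UW}_i$, and $h_i$ are learned networks, whereas the functions below (`initial_weights`, `inverse_attention`, `update_weights`, `act`) are hypothetical hand-written stand-ins chosen only to make the three phases concrete. The sketch assumes each goal is observed as a 2D gradient-field vector, that at least one other agent is present, and that the update rule discounts goals other agents already attend to; the discount strength `beta` is likewise an illustrative parameter.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def initial_weights(goal_fields):
    """Stand-in for W_i (assumption): score each goal by the magnitude
    of its gradient-field vector, then normalize with softmax."""
    return softmax([math.hypot(gx, gy) for gx, gy in goal_fields])

def inverse_attention(other_fields, other_action):
    """Stand-in for IW_i (assumption): a goal whose gradient vector aligns
    with the other agent's action is inferred to receive more attention."""
    return softmax([dot(other_action, g) for g in other_fields])

def update_weights(own, inferred_list, beta=2.0):
    """Stand-in for UW_i (assumption): discount goals that the other
    agents attend to on average, then renormalize to sum to one."""
    n = len(own)
    mean_other = [sum(w[k] for w in inferred_list) / len(inferred_list)
                  for k in range(n)]
    raw = [own[k] * math.exp(-beta * mean_other[k]) for k in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def act(weights, goal_fields):
    """Stand-in for h_i (assumption): the action is the attention-weighted
    sum of the per-goal gradient vectors."""
    ax = sum(w * g[0] for w, g in zip(weights, goal_fields))
    ay = sum(w * g[1] for w, g in zip(weights, goal_fields))
    return (ax, ay)

def forward(own_fields, others):
    """Run the three phases in order.

    own_fields: list of (gx, gy) gradient vectors, one per goal.
    others: list of (fields, action) pairs, one per observed agent.
    Returns (initial weights, updated weights, final action).
    """
    w = initial_weights(own_fields)                              # phase 1
    inferred = [inverse_attention(f, a) for f, a in others]      # phase 2
    w_tilde = update_weights(w, inferred)                        # phase 3
    return w, w_tilde, act(w_tilde, own_fields)

if __name__ == "__main__":
    fields = [(1.0, 0.0), (0.0, 1.0)]   # two goals with equal-magnitude fields
    teammate = (fields, (1.0, 0.0))     # teammate is moving toward goal 0
    w, w_tilde, action = forward(fields, [teammate])
    print("initial:", w)
    print("updated:", w_tilde)
    print("action: ", action)
```

In this toy setup the agent starts with equal attention on both goals, infers that its teammate is attending to the first goal, and shifts its own attention and action toward the second. Replacing each stand-in with a trained network would recover the structure the captions describe, though the direction and form of the learned update may differ from the discount rule assumed here.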