N$\text{A}^\text{2}$Q: Neural Attention Additive Model for Interpretable Multi-Agent Q-Learning

Zichuan Liu; Yuanyang Zhu; Chunlin Chen

TL;DR

This work addresses the opaque credit assignment problem in cooperative MARL by introducing N$\text{A}^\text{2}$Q, a value decomposition method based on neural additive models that makes decisions transparent. N$\text{A}^\text{2}$Q learns unary and pairwise shape functions to decompose the joint value, augmented with identity semantics via a VAE to provide interpretable local observations and backdoor-adjusted credits through attention. Theoretical analysis yields approximation guarantees for the enriched decomposition, and experiments on Level Based Foraging and SMAC demonstrate both strong performance and interpretable decision-making, including visual masks that reveal what agents attend to. Overall, N$\text{A}^\text{2}$Q advances interpretable coordination in multi-agent systems while remaining competitive and providing diagnostic tools for understanding agent behavior.

Abstract

Value decomposition is widely used in cooperative multi-agent reinforcement learning; however, its implicit credit assignment mechanism is not yet fully understood because it relies on black-box networks. In this work, we study an interpretable value decomposition framework via the family of generalized additive models. We present a novel method, named Neural Attention Additive Q-learning (N$\text{A}^\text{2}$Q), which provides inherent intelligibility of collaboration behavior. N$\text{A}^\text{2}$Q explicitly factorizes the optimal joint policy, induced by enriched shape functions that model all possible coalitions of agents, into individual policies. Moreover, we construct identity semantics to help estimate credits together with the global state and individual value functions, where local semantic masks help us diagnose whether each agent captures task-relevant information. Extensive experiments show that N$\text{A}^\text{2}$Q consistently achieves superior performance compared with different state-of-the-art methods on all challenging tasks, while yielding human-like interpretability.
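To make the additive structure described in the abstract concrete, the decomposition can be sketched in generalized additive form. This is an illustrative reading rather than the paper's exact equation: the bias term $f_0$ and the credit weights $\alpha_k$ are assumed notation, while the shape functions $f_k \in \{f_1, \cdots, f_{1 \ldots n}\}$ and $Q_{tot}$ follow the Figure 1 caption.

```latex
Q_{tot} = f_0
  + \sum_{i=1}^{n} \alpha_i \, f_i(Q_i)
  + \sum_{1 \le i < j \le n} \alpha_{ij} \, f_{ij}(Q_i, Q_j)
  + \cdots
  + \alpha_{1 \ldots n} \, f_{1 \ldots n}(Q_1, \ldots, Q_n)
```

Under this reading, each term is attributable to one coalition of agents, which is what makes the credit assigned to individuals and groups inspectable.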

Paper Structure (25 sections, 2 theorems, 32 equations, 12 figures, 4 tables, 1 algorithm)

Introduction
Preliminaries
Dec-POMDP
Credit Assignment in MARL
Theoretical Analysis for Decomposition
Neural Attention Additive Q-learning
Experiments
Level Based Foraging
StarCraft Multi-Agent Challenge
Ablation Study
Conclusion
Acknowledgements
Credit Assignment for Value Decomposition Algorithms
Approximation Guarantees for N$\text{A}^\text{2}$Q
Variational Auto-Encoder Background
...and 10 more sections

Key Result

Theorem 2.2

Let $\ell$ be 1-Lipschitz, $\delta \in (0, 1]$ and Assumption ass1 hold with constants $\{C_1, C_2, \eta\}$. Then, for $L_1$-norm models, where $\left\| \boldsymbol{a}_{ld} \right\|_1\leq B_a, 1\leq l \leq n$, and $\left\| \boldsymbol{\lambda} \right\|_1\leq B_\lambda$ where $\boldsymbol{\lambda}=\{\ldots$

Figures (12)

Figure 1: An example of value decomposition via the GAMs family in MARL, where $\boldsymbol{s}\in \mathcal{S}$ is the global state, $f_k\in \{f_1, \cdots, f_{1 \ldots n}\}$ denotes the contribution of a shape function to learning individual or pairwise action values, and $Q_{tot}$ denotes the joint action value.
Figure 2: The overall framework of N$\text{A}^\text{2}$Q. First, each agent receives its local action-observation history $\tau_i$ and models its individual value function $Q_i(\tau_i, u_i)$. Next, we construct the identity semantic $z_i$ by encoding $\tau_i$, and take it together with the global state $\boldsymbol{s}$ to estimate credits, which provides a captured semantic interpretation. In the mixing network, we transform the local Q-values $[Q_i]_{i=1}^n$ into temporal Q-values $[\widehat{Q}_k]_{k=1}^m$ by the shape function $f_k$ within order-$l$, where $l\in \mathcal{N}$, which are used to predict the joint Q-value together with credits.
Figure 3: Average test return on two constructed tasks of LBF.
Figure 4: Test win rate (%) on hard (first row) and super hard (second row) maps of the SMAC benchmark.
Figure 5: Visualization of the agent's mask at step 4; the title of each panel indicates the corresponding credit assignment. The highlighted areas are the regions most important for making decisions.
...and 7 more figures
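The mixing step described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the helper names (`enumerate_coalitions`, `make_shape_fn`, `additive_q_tot`) are hypothetical, the shape functions are tiny random MLPs standing in for learned ones, and the credits are random softmax weights standing in for the attention output computed from the global state $\boldsymbol{s}$ and identity semantics $z_i$.

```python
from itertools import combinations

import numpy as np


def enumerate_coalitions(n_agents, max_order):
    """All agent index tuples of size 1..max_order, in a fixed order."""
    return [c for order in range(1, max_order + 1)
            for c in combinations(range(n_agents), order)]


def make_shape_fn(rng, order, hidden=8):
    """Random one-hidden-layer MLP standing in for a learned shape function f_k."""
    w1 = rng.normal(size=(order, hidden))
    w2 = rng.normal(size=(hidden,))

    def f(q_subset):
        h = np.maximum(q_subset @ w1, 0.0)  # ReLU hidden layer
        return float(h @ w2)

    return f


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def additive_q_tot(q_locals, coalitions, shape_fns, credits, bias=0.0):
    """Combine local Q-values into Q_tot as a credit-weighted sum of shape outputs."""
    q_hat = np.array([f(q_locals[list(c)])
                      for c, f in zip(coalitions, shape_fns)])
    return bias + float(credits @ q_hat), q_hat


rng = np.random.default_rng(0)
n_agents, max_order = 3, 2  # unary and pairwise terms only
coalitions = enumerate_coalitions(n_agents, max_order)
shape_fns = [make_shape_fn(rng, len(c)) for c in coalitions]

q_locals = np.array([0.5, -1.0, 2.0])  # placeholder Q_i(tau_i, u_i) values
# Placeholder credits; in the paper these are estimated from s and z_i.
credits = softmax(rng.normal(size=len(coalitions)))

q_tot, q_hat = additive_q_tot(q_locals, coalitions, shape_fns, credits)
contributions = credits * q_hat  # per-coalition share of Q_tot
for c, contrib in zip(coalitions, contributions):
    print(c, round(float(contrib), 4))
```

Because $Q_{tot}$ is a plain sum of per-coalition terms in this sketch, the printed contributions add up to `q_tot`, which is the property that makes the credit of each agent or pair directly readable.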

Theorems & Definitions (4)

Theorem 2.2
proof
Lemma 2.3
proof
