Breaking the Curse of Multiagency in Robust Multi-Agent Reinforcement Learning
Laixi Shi, Jingchu Gai, Eric Mazumdar, Yuejie Chi, Adam Wierman
TL;DR
The work tackles robustness in multi-agent reinforcement learning by formulating distributionally robust Markov games (RMGs) with fictitious uncertainty sets that integrate environment dynamics and others’ behavior, motivated by behavioral economics. It proves the existence of robust equilibria (robust NE and robust CCE) for this class and introduces Robust-Q-FTRL, a sample-efficient algorithm that learns an $\varepsilon$-robust CCE under a generative model. The main theoretical result is a sample complexity of $\tilde{O}\left(\frac{S H^6 \sum_i A_i}{\varepsilon^4} \min\left\{H, \frac{1}{\min_i \sigma_i}\right\}\right)$, which is polynomial in all relevant parameters and depends on the sum, rather than the product, of the agents' action-space sizes. To the authors' knowledge, this is the first result to break the curse of multiagency for RMGs under any uncertainty-set formulation. This advances robust MARL by enabling data-efficient learning in settings with uncertainty about both the environment and other agents’ behavior, and it opens avenues for uncertainty-set design, equilibrium refinement, and broader applicability in risk-aware multi-agent systems.
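To make the scaling in the bound above concrete, the following minimal sketch evaluates its leading-order term and compares it with the size of the joint action space. It assumes all constants and logarithmic factors hidden by $\tilde{O}$ are dropped, so the returned numbers are only meaningful as ratios, not as actual sample counts. The function names and the example values of $S$, $H$, $A_i$, $\varepsilon$, and $\sigma_i$ are hypothetical and not taken from the paper.

```python
from math import prod


def robust_cce_bound_leading_term(S, H, action_sizes, eps, sigmas):
    """Leading-order term S * H^6 * sum_i A_i / eps^4 * min{H, 1 / min_i sigma_i}.

    Constants and log factors hidden by the O-tilde notation are ignored,
    so only relative comparisons between settings are meaningful.
    """
    return S * H**6 * sum(action_sizes) / eps**4 * min(H, 1.0 / min(sigmas))


def joint_action_count(action_sizes):
    """Number of joint actions, prod_i A_i, which grows exponentially in the number of agents."""
    return prod(action_sizes)


# Hypothetical settings: 4 agents vs. 8 agents, each with 5 actions.
sizes_small = [5] * 4
sizes_large = [5] * 8

bound_small = robust_cce_bound_leading_term(
    S=10, H=5, action_sizes=sizes_small, eps=0.5, sigmas=[0.1] * 4
)
bound_large = robust_cce_bound_leading_term(
    S=10, H=5, action_sizes=sizes_large, eps=0.5, sigmas=[0.1] * 8
)

# Doubling the number of agents doubles the leading term of the bound ...
print(bound_large / bound_small)
# ... while the joint action space grows by a factor of 5^4.
print(joint_action_count(sizes_large) // joint_action_count(sizes_small))
```

In this illustrative setting, the dependence on the agents enters only through $\sum_i A_i$, so the leading term grows linearly with the number of agents, whereas any quantity tied to $\prod_i A_i$ grows exponentially.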
Abstract
Standard multi-agent reinforcement learning (MARL) algorithms are vulnerable to sim-to-real gaps. To address this, distributionally robust Markov games (RMGs) have been proposed to enhance robustness in MARL by optimizing the worst-case performance when game dynamics shift within a prescribed uncertainty set. RMGs remain under-explored, from reasonable problem formulation to the development of sample-efficient algorithms. Two notorious open challenges are the formulation of the uncertainty set and whether the corresponding RMGs can overcome the curse of multiagency, where the sample complexity scales exponentially with the number of agents. In this work, we propose a natural class of RMGs inspired by behavioral economics, where each agent's uncertainty set is shaped by both the environment and the integrated behavior of other agents. We first establish the well-posedness of this class of RMGs by proving the existence of game-theoretic solutions such as robust Nash equilibria and coarse correlated equilibria (CCE). Assuming access to a generative model, we then introduce a sample-efficient algorithm for learning the CCE whose sample complexity scales polynomially with all relevant parameters. To the best of our knowledge, this is the first algorithm to break the curse of multiagency for RMGs, regardless of the uncertainty set formulation.
