Empirical Study on Robustness and Resilience in Cooperative Multi-Agent Reinforcement Learning

Simin Li, Zihao Mao, Hanxiao Li, Zonglei Jing, Zhuohang Bian, Jun Guo, Li Wang, Zhuoran Han, Ruixiao Xu, Xin Yu, Chengdong Ma, Yuqing Ma, Bo An, Yaodong Yang, Weifeng Lv, Xianglong Liu

TL;DR

This work addresses the gap between training-time cooperation and deployment-time robustness and resilience in cooperative multi-agent reinforcement learning (MARL). It conducts a large-scale empirical study across 4 real-world environments, 13 uncertainty types, and 15 hyperparameters to distinguish robustness from resilience and to quantify how hyperparameter choices affect trustworthy MARL. Key findings show that the benefits of cooperation for robustness and resilience weaken under stronger perturbations, that robustness and resilience do not generalize across uncertainty modalities or agent scopes, and that hyperparameter tuning, often more than algorithm choice, drives substantial performance gains and generalizes to robust MARL methods. The results provide practical guidance on hyperparameter selection (e.g., early stopping, higher critic learning rates, Leaky ReLU) and emphasize evaluating across diverse uncertainties for trustworthy MARL systems.

Abstract

In cooperative Multi-Agent Reinforcement Learning (MARL), it is common practice to tune hyperparameters in ideal simulated environments to maximize cooperative performance. However, policies tuned for cooperation often fail to maintain robustness and resilience under real-world uncertainties. Building trustworthy MARL systems requires a deep understanding of robustness, which ensures stability under uncertainties, and resilience, the ability to recover from disruptions--a concept extensively studied in control systems but largely overlooked in MARL. In this paper, we present a large-scale empirical study comprising over 82,620 experiments to evaluate cooperation, robustness, and resilience in MARL across 4 real-world environments, 13 uncertainty types, and 15 hyperparameters. Our key findings are: (1) Under mild uncertainty, optimizing cooperation improves robustness and resilience, but this link weakens as perturbations intensify. Robustness and resilience also vary by algorithm and uncertainty type. (2) Robustness and resilience do not generalize across uncertainty modalities or agent scopes: policies robust to action noise for all agents may fail under observation noise on a single agent. (3) Hyperparameter tuning is critical for trustworthy MARL: surprisingly, standard practices like parameter sharing, GAE, and PopArt can hurt robustness, while early stopping, high critic learning rates, and Leaky ReLU consistently help. By optimizing hyperparameters only, we observe substantial improvement in cooperation, robustness, and resilience across all MARL backbones, with the phenomenon also generalizing to robust MARL methods across these backbones. Code and results are available at https://github.com/BUAA-TrustworthyMARL/adv_marl_benchmark .
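
To make the distinction between robustness (stability while uncertainty is present) and resilience (recovery after a disruption) concrete, the following is a minimal sketch of one way the two could be separated on a single episode. It is an illustrative assumption, not the metric used in the paper: the function name `robustness_and_resilience`, the perturbation window `[start, end)`, and the toy reward trace are all hypothetical.

```python
from statistics import mean


def robustness_and_resilience(rewards, start, end):
    """Split a per-step reward trace around a perturbation window [start, end).

    Hypothetical, illustrative definitions (not taken from the paper):
      robustness = mean reward while the perturbation is active
      resilience = mean reward after the perturbation has stopped
    """
    if not 0 <= start < end < len(rewards):
        raise ValueError("need 0 <= start < end < len(rewards)")
    robustness = mean(rewards[start:end])  # performance under perturbation
    resilience = mean(rewards[end:])       # performance during recovery
    return robustness, resilience


# Toy trace: the perturbation is active at steps 2 and 3, then removed.
trace = [1.0, 1.0, 0.25, 0.25, 0.5, 0.75, 1.0, 1.0]
rob, res = robustness_and_resilience(trace, start=2, end=4)
print(f"robustness={rob}, resilience={res}")
```

Under these assumed definitions, a policy can score low on robustness (reward collapses while perturbed) yet high on resilience (reward returns quickly once the perturbation ends), which is why the two properties need to be evaluated separately.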

Paper Structure

This paper contains 36 sections, 3 equations, 16 figures, 6 tables.

Figures (16)

  • Figure 1: While optimizing hyperparameters to improve cooperation, the algorithm gets significantly less robust and resilient when uncertainty occurs.
  • Figure 2: Relation between cooperation, robustness, and resilience under uncertainty. Cooperative MARL is trained without perturbations, but must be robust and resilient when they occur.
  • Figure 3: Correlation between cooperation, robustness, and resilience across uncertainty types and algorithms.
  • Figure 4: The extent to which cooperation is correlated with robustness and resilience depends linearly on the severity of the attack.
  • Figure 5: The self-correlation between 13 types of uncertainties in terms of robustness and resilience. Correlations in uncertainties differ significantly across modalities (observations, actions, environments) and scope of agents (applied to individual or all agents).
  • ...and 11 more figures