StarCraft+: Benchmarking Multi-agent Algorithms in Adversary Paradigm

Yadong Li, Tong Zhang, Bo Huang, Zhen Cui

TL;DR

SC2BA introduces a StarCraft II–based adversarial MARL benchmark to address the limited opponent diversity of fixed AI bots. It provides two adversary modes (dual-algorithm paired and multi-algorithm mixed) and the APyMARL framework to enable algorithm-vs-algorithm training and evaluation. Extensive experiments across symmetric and asymmetric scenarios reveal that adversarial diversity can boost robustness, generalization, and learning dynamics, while also exposing challenges in scalability and heterogeneity. The open-source platform aims to catalyze advances in robust, diverse, and scalable multi-agent reinforcement learning research.

Abstract

Deep multi-agent reinforcement learning (MARL) algorithms are booming in the field of collaborative intelligence, and the StarCraft Multi-Agent Challenge (SMAC) is widely used as the benchmark therein. However, the opponents of MARL algorithms are in practice configured and controlled by a fixed built-in AI, which limits the diversity and versatility of algorithm evaluation. To address this issue, we establish a multi-agent algorithm-vs-algorithm environment, named StarCraft II battle arena (SC2BA), to refresh the benchmarking of MARL algorithms in an adversary paradigm. Taking StarCraft as infrastructure, the SC2BA environment is created specifically for inter-algorithm adversary, with fairness, usability, and customizability in mind; alongside it, an adversarial PyMARL (APyMARL) library is developed with easy-to-use interfaces and modules. Building on SC2BA, we benchmark classic MARL algorithms in two adversarial modes: dual-algorithm paired adversary, which pits pairs of algorithms against each other, and multi-algorithm mixed adversary, which pits an algorithm against the multiple behaviors of a group of algorithms. The extensive benchmark experiments yield some thought-provoking observations and open problems regarding the effectiveness, sensibility, and scalability of these algorithms. The SC2BA environment and the reproduced experiments are released on GitHub (https://github.com/dooliu/SC2BA), and we believe that this work could mark a new step for the MARL field in the coming years.

Paper Structure

This paper contains 22 sections, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Details of the SC2BA platform and its sub-modules: the configuration module, the interaction module, and the bottom-level control module. See the SC2BA overview section for details.
  • Figure 2: Overview of the APyMARL framework. More details can be found in the APyMARL library section.
  • Figure 3: Overall results of pairwise combat among eight algorithms in seven symmetric scenarios under the dual-algorithm adversary mode. Left: the median test win rates, averaged across all 7 scenarios. Right: the number of scenarios in which each algorithm outperforms the other algorithms (its smoothed median test win rate is the highest by at least 1/32).
  • Figure 4: Win rates of pairwise combat among eight algorithms in seven symmetric scenarios (3m, 8m, 2s3z, 3s5z, MMM, 1c3s5z, and 25m) under the dual-algorithm adversary mode. In each scenario, an algorithm (such as QMIX) fights all eight algorithms (including itself), and the eight win rates are averaged to give its adversarial performance. The easy scenarios are 3m, 8m, and 2s3z; the hard scenarios are 3s5z, MMM, 1c3s5z, and 25m.
  • Figure 5: Influence of troop difference in the dual-algorithm adversary mode. Results for all algorithms competing against COMA in 5m_vs_6m are plotted, including median win rates and median returns. The median returns are normalized to the range 0-1.
  • ...and 6 more figures
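
The aggregation described in the captions of Figures 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the dictionary layout, and the reading of "highest by at least 1/32" as a margin over every other algorithm are all assumptions, and the numbers in the usage example are placeholders rather than results from the paper.

```python
from statistics import mean

# Margin taken from the Figure 3 caption.
MARGIN = 1 / 32


def adversarial_performance(win_rates_vs_opponents):
    """Average one algorithm's win rates against all opponents
    (including itself) in a single scenario, as in the Figure 4 caption."""
    return mean(win_rates_vs_opponents)


def count_best_scenarios(perf, margin=MARGIN):
    """Count, per algorithm, the scenarios in which its score exceeds
    every other algorithm's score by at least `margin`.

    `perf` is a hypothetical layout: {scenario: {algorithm: score}}.
    """
    counts = {}
    for scores in perf.values():
        for alg, score in scores.items():
            counts.setdefault(alg, 0)
            others = [s for a, s in scores.items() if a != alg]
            if others and all(score - s >= margin for s in others):
                counts[alg] += 1
    return counts


# Placeholder values for illustration only (not taken from the paper).
example_perf = {
    "3m": {"QMIX": 0.60, "VDN": 0.50, "COMA": 0.30},
    "8m": {"QMIX": 0.52, "VDN": 0.51, "COMA": 0.40},
}
example_counts = count_best_scenarios(example_perf)
```

With these placeholder values, QMIX is counted as best only in 3m, because its lead in 8m is smaller than 1/32; no algorithm is credited for 8m under this reading of the margin rule.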