StarCraft+: Benchmarking Multi-agent Algorithms in Adversary Paradigm
Yadong Li, Tong Zhang, Bo Huang, Zhen Cui
TL;DR
SC2BA introduces a StarCraft II–based adversarial MARL benchmark to address the limited opponent diversity of fixed AI bots. It provides two adversary modes (dual-algorithm paired and multi-algorithm mixed) and the APyMARL framework to enable algorithm-vs-algorithm training and evaluation. Extensive experiments across symmetric and asymmetric scenarios reveal that adversarial diversity can boost robustness, generalization, and learning dynamics, while also exposing challenges in scalability and heterogeneity. The open-source platform aims to catalyze advances in robust, diverse, and scalable multi-agent reinforcement learning research.
Abstract
Deep multi-agent reinforcement learning (MARL) algorithms are booming in the field of collaborative intelligence, and the StarCraft multi-agent challenge (SMAC) is widely used as the benchmark therein. However, the opponents of MARL algorithms are in practice configured and controlled by a fixed built-in AI, which limits the diversity and versatility of algorithm evaluation. To address this issue, we establish a multi-agent algorithm-vs-algorithm environment, named StarCraft II battle arena (SC2BA), to refresh the benchmarking of MARL algorithms in an adversary paradigm. Taking StarCraft as infrastructure, the SC2BA environment is specifically created for inter-algorithm adversarial play with fairness, usability, and customizability in mind; meanwhile, an adversarial PyMARL (APyMARL) library is developed with easy-to-use interfaces and modules. Building on SC2BA, we benchmark classic MARL algorithms in two types of adversarial modes: dual-algorithm paired adversary and multi-algorithm mixed adversary. The former pits pairs of algorithms against each other, while the latter confronts an algorithm with the multiple behaviors produced by a group of algorithms. The extensive benchmark experiments yield some thought-provoking observations and open problems concerning the effectiveness, sensitivity, and scalability of these established algorithms. The SC2BA environment and the reproduced experiments are released on GitHub (https://github.com/dooliu/SC2BA), and we believe that this work could mark a new step for the MARL field in the coming years.
