SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC

Yue Deng; Yan Yu; Weiyu Ma; Zirui Wang; Wenhui Zhu; Jian Zhao; Yin Zhang

TL;DR

SMAC-HARD addresses benchmark saturation in MARL by introducing an editable, diverse opponent framework for SMAC that supports opponent-script editing, randomized strategy mixing, and synchronized self-play interfaces. It pairs an LLM-guided planning pipeline, which generates opponent and agent decision trees and converts them into pysc2-compatible scripts, with a black-box evaluation that tests policy coverage on unseen adversaries. Experimental results show that leading MARL algorithms struggle under edited and mixed-opponent conditions, highlighting how poorly policies trained against a single strategy transfer. By offering a configurable testing ground and an accompanying evaluation framework, SMAC-HARD aims to drive the development of more robust, self-play-oriented MARL methods that transfer to diverse opponents.

Abstract

The availability of challenging simulation environments is pivotal for advancing the field of Multi-Agent Reinforcement Learning (MARL). In cooperative MARL settings, the StarCraft Multi-Agent Challenge (SMAC) has gained prominence as a benchmark for algorithms following the centralized training with decentralized execution paradigm. However, with continual advances on SMAC, many algorithms now exhibit near-optimal performance, which complicates the evaluation of their true effectiveness. In this work, we highlight a critical issue: the default opponent policy in these environments lacks sufficient diversity, leading MARL algorithms to overfit and exploit unintended vulnerabilities rather than learn robust strategies. To overcome this limitation, we propose SMAC-HARD, a novel benchmark designed to enhance training robustness and evaluation comprehensiveness. SMAC-HARD supports customizable opponent strategies, randomization of adversarial policies, and interfaces for MARL self-play, enabling agents to generalize to varying opponent behaviors and improving model stability. Furthermore, we introduce a black-box testing framework in which agents are trained without exposure to the edited opponent scripts but are tested against these scripts to evaluate the policy coverage and adaptability of MARL algorithms. We conduct extensive evaluations of widely used and state-of-the-art algorithms on SMAC-HARD, revealing the substantial challenges posed by edited and mixed-strategy opponents. Additionally, the black-box strategy tests illustrate the difficulty of transferring learned policies to unseen adversaries. We envision SMAC-HARD as a critical step toward benchmarking the next generation of MARL algorithms, fostering progress in self-play methods for multi-agent systems. Our code is available at https://github.com/devindeng94/smac-hard.
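
The two evaluation ideas in the abstract, randomizing the opponent policy and holding some scripts out for black-box testing, can be illustrated with a minimal Python sketch. Nothing below is the SMAC-HARD API: the toy script functions, `split_scripts`, and `MixedOpponent` are hypothetical names introduced only for illustration, and an opponent "script" is assumed to be a plain function from an observation to an action label.

```python
import random
from typing import Callable, Dict, Optional, Sequence, Tuple

# Assumption: an observation is a small dict and an action is a string label.
Observation = Dict[str, bool]
OpponentScript = Callable[[Observation], str]


# Three toy opponent scripts (hypothetical, not taken from SMAC-HARD).
def focus_fire(obs: Observation) -> str:
    return "attack_weakest"


def kite(obs: Observation) -> str:
    return "retreat" if obs.get("enemy_in_range", False) else "attack_nearest"


def hold_position(obs: Observation) -> str:
    return "hold"


def split_scripts(
    scripts: Dict[str, OpponentScript], held_out: Sequence[str]
) -> Tuple[Dict[str, OpponentScript], Dict[str, OpponentScript]]:
    """Split scripts into a training pool and a black-box test pool."""
    held = set(held_out)
    train = {name: fn for name, fn in scripts.items() if name not in held}
    test = {name: fn for name, fn in scripts.items() if name in held}
    return train, test


class MixedOpponent:
    """Samples one script per episode from a pool, optionally weighted."""

    def __init__(
        self,
        scripts: Dict[str, OpponentScript],
        weights: Optional[Sequence[float]] = None,
        seed: Optional[int] = None,
    ) -> None:
        if not scripts:
            raise ValueError("at least one opponent script is required")
        self._scripts = dict(scripts)
        self._names = list(self._scripts)
        self._weights = list(weights) if weights is not None else None
        self._rng = random.Random(seed)
        self.current_name = self._names[0]

    def reset(self) -> str:
        """Pick the script for the next episode and return its name."""
        self.current_name = self._rng.choices(
            self._names, weights=self._weights, k=1
        )[0]
        return self.current_name

    def act(self, obs: Observation) -> str:
        """Delegate to the script chosen at the last reset."""
        return self._scripts[self.current_name](obs)


ALL_SCRIPTS = {"focus_fire": focus_fire, "kite": kite, "hold_position": hold_position}

# Train against a random mixture; "kite" is never seen during training.
train_pool, test_pool = split_scripts(ALL_SCRIPTS, held_out=["kite"])
train_opponent = MixedOpponent(train_pool, seed=0)
seen_in_training = {train_opponent.reset() for _ in range(200)}

# Black-box evaluation uses only the held-out scripts.
test_opponent = MixedOpponent(test_pool, seed=0)
test_opponent.reset()
```

In this sketch the split is made by script name before any training starts, so the held-out script cannot leak into the training mixture; a real setup would replace the toy functions with actual opponent scripts and wrap the environment's episode reset.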
