FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

Wenzhe Li; Zihan Ding; Seth Karten; Chi Jin

TL;DR

This work presents FightLadder, a real-time fighting game platform whose environments are designed to address key challenges in competitive multi-agent reinforcement learning (MARL) research.

Abstract

Recent advances in reinforcement learning (RL) rely heavily on a variety of well-designed benchmarks, which provide environmental platforms and consistent criteria to evaluate existing and novel algorithms. Specifically, in multi-agent RL (MARL), a plethora of benchmarks based on cooperative games have spurred the development of algorithms that improve the scalability of cooperative multi-agent systems. However, for the competitive setting, a lightweight and open-source benchmark with challenging gaming dynamics and visual inputs has not yet been established. In this work, we present FightLadder, a real-time fighting game platform, to empower competitive MARL research. Along with the platform, we provide implementations of state-of-the-art MARL algorithms for competitive games, as well as a set of evaluation metrics to characterize the performance and exploitability of agents. We demonstrate the feasibility of this platform by training a general agent that consistently defeats 12 built-in characters in single-player mode, and expose the difficulty of training a non-exploitable agent without human knowledge and demonstrations in two-player mode. FightLadder provides meticulously designed environments to address critical challenges in competitive MARL research, aiming to catalyze a new era of discovery and advancement in the field. Videos and code are available at https://sites.google.com/view/fightladder/home.

Paper Structure (51 sections, 6 equations, 15 figures, 7 tables, 1 algorithm)

Introduction
Related Work
MARL Environments.
MARL Algorithms and Evaluation.
Multi-Agent Reinforcement Learning
FightLadder
Scenarios
State and Observations
Action Space
Rewards
Sparse Reward.
Win Rate.
Shaped Dense Reward.
Features
Rich Strategy Space.
...and 36 more sections

Figures (15)

Figure 1: FightLadder currently supports various cross-platform video fighting games: Street Fighter II (Genesis platform), Street Fighter III (Arcade platform), Fatal Fury 2 (Genesis platform), Mortal Kombat (Genesis platform), and The King of Fighters '97 (Neo Geo platform).
Figure 2: Motion and attack action spaces of fighting games. Images are adapted from the Street Fighter II instruction manual.
Figure 3: Examples of special moves for the character Ryu in Street Fighter II (left to right): Fireball, Dragon Punch, Hurricane Kick. Images are adapted from the Street Fighter II instruction manual.
Figure 4: The win rate curves and scheduling distribution bar plot in sf_ryu_full_game via the proposed PPO with curriculum learning. Opponents of different characters are marked with different levels. Levels 4, 8, and 12 are omitted as they are bonus levels without fighting.
Figure 5: The payoff matrix for each pair of agents at a certain stage of league training. In league training, there is one main agent (MA), two league exploiters (LE0, LE1), and one main exploiter (ME) for each side (left or right). The name of each row gives the agent information as Character_Side_Checkpoint. Checkpoint=h_xM denotes a historical version of the agent saved at x million steps. Each value is the win rate of the left (row) player against the right (column) player. For instance, ME0_right beats every MA0_left_h_xM agent with high probability, indicating that main exploiters in the league can fully exploit previous main agents. Also, the high win rate of MA0_left against all right agents (except MA0_right) shows that the main agent at the current step outperforms the other agents in the league.
...and 10 more figures

Theorems & Definitions (3)

Definition 3.1: Best Response
Definition 3.2: Nash Equilibrium
Definition 3.3: Exploitability
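
The three definitions above are listed here by name only. As a rough illustration of how they relate, the sketch below uses the standard textbook form for a two-player zero-sum matrix game, which is an assumption and not necessarily the paper's notation: a best response maximizes a player's payoff against a fixed opponent strategy, a Nash equilibrium is a strategy pair in which each strategy is a best response to the other, and exploitability is the total payoff both players could gain by switching to a best response. The function name and the rock-paper-scissors payoff matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def exploitability(payoff, row_strategy, col_strategy):
    """Exploitability of a strategy pair in a two-player zero-sum matrix game.

    Assumption (not taken from the paper): payoff[i, j] is the row player's
    expected payoff when row plays action i and column plays action j, and the
    column player receives -payoff[i, j].
    """
    payoff = np.asarray(payoff, dtype=float)
    x = np.asarray(row_strategy, dtype=float)  # row player's mixed strategy
    y = np.asarray(col_strategy, dtype=float)  # column player's mixed strategy

    # Best payoff the row player can obtain against the fixed strategy y.
    row_best_response_value = np.max(payoff @ y)
    # Lowest row payoff the column player can force against the fixed strategy x.
    col_best_response_value = np.min(x @ payoff)

    # Sum of both players' gains from deviating to a best response.
    # It is non-negative and equals zero exactly at a Nash equilibrium.
    return row_best_response_value - col_best_response_value


# Hypothetical example game: rock-paper-scissors from the row player's view.
rps = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1, 0, 0]

print(exploitability(rps, uniform, uniform))
print(exploitability(rps, always_rock, always_rock))
```

Under these assumptions, uniform play in rock-paper-scissors has exploitability 0 (it is the Nash equilibrium), while both players always choosing rock has exploitability 2, since each side could gain 1 by deviating. A win-rate matrix such as the one in Figure 5 would first need to be mapped to a zero-sum payoff, for example 2 * win_rate - 1 if draws are ignored, which is again an assumption.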
