Simulation-Driven Balancing of Competitive Game Levels with Reinforcement Learning

Florian Rupp, Manuel Eberhardinger, Kai Eckert

TL;DR

The paper tackles automatic balancing of tile-based, competitive two-player game levels via a domain-independent PCG approach using reinforcement learning. It introduces swap-based representations within the PCGRL framework, separating level generation from balancing to fine-tune pre-made levels toward a target balance measured by a fairness-inspired reward derived from statistical parity. Through extensive experiments in the Neural MMO environment, the approach achieves higher balanced-level percentages (up to 68%) and efficiently identifies which tile types most impact balance, outperforming hill-climbing baselines. The work also explores balancing toward different targets, asymmetries, and the applicability of fairness metrics, offering a practical, scalable method for balance-aware level design with potential extensions to other domains. Overall, the method enables rapid, data-driven balancing while providing insights into design decisions that affect competitive fairness.

Abstract

The balancing process for game levels in competitive two-player contexts involves a lot of manual work and testing, particularly for non-symmetrical game levels. In this work, we frame game balancing as a procedural content generation task and propose an architecture for automatically balancing tile-based levels within the PCGRL framework (procedural content generation via reinforcement learning). Our architecture is divided into three parts: (1) a level generator, (2) a balancing agent, and (3) a reward modeling simulation. Through repeated simulations, the balancing agent receives rewards for adjusting the level towards a given balancing objective, such as equal win rates for all players. To this end, we propose new swap-based representations to improve the robustness of playability, thereby enabling agents to balance game levels more effectively and quickly than traditional PCGRL. By analyzing the agent's swapping behavior, we can infer which tile types have the most impact on the balance. We validate our approach in the Neural MMO (NMMO) environment in a competitive two-player scenario. In this extended conference paper, we present improved results, explore the applicability of the method to various forms of balancing beyond equal balancing, compare the performance to another search-based approach, and discuss the application of existing fairness metrics to game balancing.
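
The swap idea mentioned in the abstract can be illustrated with a small sketch. It assumes a level is a 2D grid of tile names and shows only one property of a swap: exchanging two tiles changes their positions but never the number of tiles of each type. The helper names `swap_tiles` and `tile_counts`, and the tile names `grass` and `rock`, are hypothetical and not taken from the paper's implementation.

```python
from collections import Counter
from typing import List, Tuple

Grid = List[List[str]]
Pos = Tuple[int, int]


def swap_tiles(level: Grid, a: Pos, b: Pos) -> Grid:
    """Return a copy of `level` with the tiles at positions a and b exchanged."""
    new_level = [row[:] for row in level]  # copy rows so the input stays untouched
    (ra, ca), (rb, cb) = a, b
    new_level[ra][ca], new_level[rb][cb] = new_level[rb][cb], new_level[ra][ca]
    return new_level


def tile_counts(level: Grid) -> Counter:
    """Count how often each tile type occurs in the level."""
    return Counter(tile for row in level for tile in row)


# Hypothetical 3x3 level; "food" and "water" follow the resource names in Figure 1.
level = [
    ["grass", "water", "grass"],
    ["food", "grass", "rock"],
    ["grass", "grass", "food"],
]
swapped = swap_tiles(level, (0, 1), (2, 2))
```

Here `swapped` has the water and food tiles exchanged, while `tile_counts(swapped)` equals `tile_counts(level)`. A representation that edits a single tile at a time, by contrast, can add or delete a resource outright.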

Paper Structure

This paper contains 32 sections, 5 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: In our game environment, two players must forage for resources like food (dark green) and water (blue) to survive longest. By swapping the highlighted tiles (red), the trained model achieved a more balanced game in simulated game runs.
  • Figure 2: Description of the balancing architecture. It is separated into three units: a level generator, a level balancing agent, and a game playing simulation. In the latter, the game is simulated by playing it $n$ times with heuristic player agents. The reward $r_t$ for training the balancing agent is computed from the balance states $b_t$ and $b_{t-1}$ of the simulations.
  • Figure 3: How many times $n$ should we run the simulation to approximate the balance state? We determine this by calculating the mean deviation $\mu_n$ of win rates from $n-2$ to $n$, which captures how much the win rates still fluctuate.
  • Figure 4: Distribution of the initial balance states based on the players' win rates in the generated dataset of 1000 levels. We use this dataset to compare the different representations.
  • Figure 5: Comparison of the balance state distributions before and after the balancing process for each representation using the dataset of 1000 levels (Figure 4). The swap representations (a) are compared directly to the original PCGRL implementation (b).
  • ...and 2 more figures
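
The Figure 2 caption says the reward $r_t$ is computed from the balance states $b_t$ and $b_{t-1}$, but this page does not give the formula. The sketch below is therefore an assumption, not the authors' definition: it takes the balance state to be player 1's win rate over $n$ simulated games and rewards any reduction in distance to a target balance. The names `balance_state` and `reward`, and the default target of 0.5 (equal win rates), are illustrative.

```python
from typing import Sequence


def balance_state(winners: Sequence[int]) -> float:
    """Fraction of simulated games won by player 1 (entries are 1 or 2)."""
    if not winners:
        raise ValueError("need at least one simulated game")
    return sum(1 for w in winners if w == 1) / len(winners)


def reward(b_prev: float, b_curr: float, target: float = 0.5) -> float:
    """Assumed reward: positive when the level moved closer to the target balance."""
    return abs(b_prev - target) - abs(b_curr - target)


# Hypothetical outcomes of n = 10 simulated games before and after one swap.
before = [1, 1, 1, 1, 2, 1, 1, 2, 1, 1]  # player 1 wins 8 of 10
after = [1, 2, 1, 2, 2, 1, 1, 2, 1, 2]   # player 1 wins 5 of 10

b_prev = balance_state(before)
b_curr = balance_state(after)
r = reward(b_prev, b_curr)
```

In this example the swap moves the win rate from 0.8 to 0.5, so `r` is positive. A swap that pushed the win rate further from the target would yield a negative value. Changing `target` would express a deliberately unequal balancing objective.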