Balancing of competitive two-player Game Levels with Reinforcement Learning

Florian Rupp, Manuel Eberhardinger, Kai Eckert

TL;DR

This work addresses the challenge of balancing competitive two-player tile-based game levels by introducing a domain-independent architecture within PCGRL that separates level generation from balancing. It employs swap-based representations (Swap-Narrow, Swap-Turtle, Swap-Wide) and a reward design driven by simulated balancing outcomes to modify given levels toward equal win rates, demonstrated in the Neural MMO environment. The approach improves balancing efficiency versus plain PCGRL, enables insight into which tiles most influence balance, and yields playable, diverse levels with high validity. The method has potential practical impact for game design and could be extended to other domains where environmental balance is critical, such as adaptive level design or city planning scenarios.

Abstract

Balancing game levels in a competitive two-player context involves a lot of manual work and testing, particularly for asymmetric levels. In this paper, we propose an architecture for automated balancing of tile-based levels within the recently introduced PCGRL framework (procedural content generation via reinforcement learning). Our architecture is divided into three parts: (1) a level generator, (2) a balancing agent, and (3) a reward modeling simulation. By repeatedly playing the level in a simulation, the balancing agent is rewarded for modifying it towards equal win rates for all players. To this end, we introduce a novel family of swap-based representations to increase robustness towards playability. We show that this approach can teach an agent to alter a level for balancing better and faster than plain PCGRL. In addition, by analyzing the agent's swapping behavior, we can draw conclusions about which tile types influence the balancing most. We test and show our results using the Neural MMO (NMMO) environment in a competitive two-player setting.
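The two building blocks named in the abstract, swapping tiles and estimating win rates by repeated simulation, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the grid encoding (a list of lists of integer tile IDs), the function names `swap_tiles` and `estimate_win_rate`, and the `play_game` callback that returns the index of the winning player are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

Level = List[List[int]]  # hypothetical encoding: grid of integer tile IDs
Pos = Tuple[int, int]    # (row, column)


def swap_tiles(level: Level, a: Pos, b: Pos) -> Level:
    """Return a copy of `level` with the tiles at positions a and b exchanged."""
    new_level = [row[:] for row in level]  # copy so the input stays untouched
    (ra, ca), (rb, cb) = a, b
    new_level[ra][ca], new_level[rb][cb] = new_level[rb][cb], new_level[ra][ca]
    return new_level


def estimate_win_rate(level: Level, play_game: Callable[[Level], int], n: int) -> float:
    """Fraction of n simulated games won by player 0.

    `play_game` is a stand-in for the game simulation (assumption): it takes
    a level and returns the index of the winning player.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    wins = sum(1 for _ in range(n) if play_game(level) == 0)
    return wins / n


if __name__ == "__main__":
    rng = random.Random(0)
    level = [[0, 1, 2], [2, 0, 1]]
    candidate = swap_tiles(level, (0, 0), (1, 2))
    coin_flip = lambda lv: rng.randint(0, 1)  # placeholder for a real simulation
    print(estimate_win_rate(candidate, coin_flip, n=100))
```

One property that is easy to verify in this sketch is that a swap only moves tiles: the number of tiles of each type is the same before and after the action.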

Paper Structure (25 sections, 1 equation, 8 figures, 1 table)

Figures (8)

  • Figure 1: In our game environment, two players must forage for resources such as food (dark green) and water (blue) to survive the longest. By swapping the highlighted tiles, the trained model achieved a more balanced game in simulated runs.
  • Figure 2: Description of the NMMO tiles.
  • Figure 3: Description of the balancing architecture. It is separated into three units: a level generator, a level balancing agent, and a level playing simulation. In the latter, the game is simulated by playing it $n$ times with player agents. The reward $r_t$ for training the balancing agent is computed from the balancing states $b_t$ and $b_{t-1}$ obtained from the simulations.
  • Figure 4: How many times $n$ should we run the simulation to approximate the balancing state? To answer this, we calculate the mean deviation $\mu_n$ of the win rates from $n-2$ to $n$, which shows how much the win rates still fluctuate.
  • Figure 5: Distribution of the initial balancing states in the generated data set of 1000 levels. We use this data set to compare the different representations.
  • ...and 3 more figures
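
The captions of Figures 3 and 4 mention a reward computed from two consecutive balancing states and a mean deviation of win rates over the runs $n-2$ to $n$, but this page does not give the exact formulas. The following Python sketch shows one plausible reading under clearly stated assumptions: the balancing state is taken to be the win rate of player 0 in $[0, 1]$ with 0.5 meaning balanced, the reward is the reduction in distance to that target, and $\mu_n$ is the mean absolute deviation of the running win rates at $n-2$, $n-1$ and $n$ from their own mean. All function names are hypothetical.

```python
from typing import List, Sequence


def balancing_reward(b_prev: float, b_curr: float, target: float = 0.5) -> float:
    """Assumed reward: positive when b_curr is closer to `target` than b_prev."""
    return abs(b_prev - target) - abs(b_curr - target)


def running_win_rates(outcomes: Sequence[int]) -> List[float]:
    """Win rate of player 0 after each of the first k games, k = 1..len(outcomes).

    `outcomes` holds the index of the winning player for each simulated game.
    """
    rates: List[float] = []
    wins = 0
    for k, winner in enumerate(outcomes, start=1):
        if winner == 0:
            wins += 1
        rates.append(wins / k)
    return rates


def mean_deviation(rates: Sequence[float], n: int) -> float:
    """Assumed definition of mu_n: mean absolute deviation of the running
    win rates after n-2, n-1 and n games from the mean of those three values."""
    if n < 3 or n > len(rates):
        raise ValueError("n must satisfy 3 <= n <= len(rates)")
    window = rates[n - 3:n]  # rates are 1-indexed by game count
    centre = sum(window) / len(window)
    return sum(abs(r - centre) for r in window) / len(window)


if __name__ == "__main__":
    outcomes = [0, 1, 0, 0, 1, 1, 0, 1]
    rates = running_win_rates(outcomes)
    for n in range(3, len(rates) + 1):
        print(n, mean_deviation(rates, n))
```

Under this reading, a small `mean_deviation` at some `n` suggests that running more simulations would change the estimated win rate only slightly, which is the kind of stopping criterion the Figure 4 caption asks about.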