Improving Conditional Level Generation using Automated Validation in Match-3 Games
Monica Villanueva Aylagas, Joakim Bergdahl, Jonas Gillberg, Alessandro Sestini, Theodor Tolstoy, Linus Gisslén
TL;DR
This paper tackles the challenge of unreliable validity and limited user control in level generation based on procedural content generation via machine learning (PCGML) by introducing Avalon, a framework that leverages automated gameplay validation during training to condition a generator on difficulty statistics. Implemented as a conditional variational autoencoder, Avalon conditions generation on factors such as the median number of moves to solve, board size, and symmetry, and employs a partial-generation masking strategy to enforce structural constraints. Empirical results in a simplified match-3 setting show that difficulty conditioning improves playability, increasing valid levels from 43.75% to 51.39%, while incurring modest declines in size, diversity, and tile-distribution fidelity. The approach demonstrates practical potential for producing valid, stylized levels with controllable difficulty, and points to future work in richer validation signals, multi-layer level representations, and broader game genres.
Abstract
Generative models for level generation have shown great potential in game production. However, they often provide limited control over the generation, and the validity of the generated levels is unreliable. Few approaches that learn from existing data give users ways to control the generation while also addressing the problem of unsolvable levels. This paper proposes Avalon, a novel method to improve models that learn from existing level designs using difficulty statistics extracted from gameplay. In particular, we use a conditional variational autoencoder to generate layouts for match-3 levels, conditioning the model on pre-collected statistics that capture game mechanics, such as difficulty, and relevant visual features, such as size and symmetry. Our method is general enough that multiple approaches could potentially be used to generate these statistics. We quantitatively evaluate our approach by comparing it to an ablated model without difficulty conditioning. Additionally, we analyze both quantitatively and qualitatively whether the style of the dataset is preserved in the generated levels. Our approach generates more valid levels than the same method without difficulty conditioning.
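To make the conditioning idea concrete, the following is a minimal sketch of how pre-collected statistics might be packed into a condition vector and joined with a latent sample before decoding. It is an illustration under stated assumptions, not the implementation used in the paper: the names LevelStats, condition_vector, and decoder_input, the normalization caps MAX_MOVES and MAX_SIZE, and the choice of features (median moves to solve, board width and height, a symmetry flag) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Assumed normalization caps; the paper does not specify these values.
MAX_MOVES = 50.0  # hypothetical upper bound on median moves to solve
MAX_SIZE = 9.0    # hypothetical maximum board side length


@dataclass
class LevelStats:
    """Pre-collected statistics for one level (hypothetical container)."""
    median_moves: float  # difficulty proxy from automated gameplay
    width: int           # board width in tiles
    height: int          # board height in tiles
    symmetric: bool      # whether the layout is symmetric


def condition_vector(stats: LevelStats) -> List[float]:
    """Scale each statistic to [0, 1] so the features share a common range."""
    moves = min(stats.median_moves, MAX_MOVES) / MAX_MOVES
    return [
        moves,
        stats.width / MAX_SIZE,
        stats.height / MAX_SIZE,
        1.0 if stats.symmetric else 0.0,
    ]


def decoder_input(latent: Sequence[float], stats: LevelStats) -> List[float]:
    """Concatenate a latent sample with the condition vector.

    In a conditional VAE, the decoder receives both the latent code and
    the conditions, so the same latent code can be decoded under
    different requested difficulty, size, or symmetry values.
    """
    return list(latent) + condition_vector(stats)


# Example: request a symmetric 9x6 board with a median of 25 moves.
requested = LevelStats(median_moves=25, width=9, height=6, symmetric=True)
z = [0.0, 0.0, 0.0]  # stand-in for a latent sample drawn from the prior
x = decoder_input(z, requested)
```

An ablated model without difficulty conditioning would, in this sketch, simply omit the first entry of the condition vector and keep the size and symmetry features.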
