Analysis of Robustness of a Large Game Corpus
Mahsa Bazzaz, Seth Cooper
TL;DR
This work tackles the fragility of highly structured discrete game level data by introducing a formal data-robustness metric and a large, diverse corpus called the Generated Game Level Corpus (GGLC). The authors adapt robustness concepts to data, define discrete and continuous forms of non-robustness, and generate thousands of solvable and unsolvable levels across four tile-based games using a constraint-based generator (Sturgeon). Their analyses show that the levels are highly sensitive to small input changes, to a degree that varies across game types, and they use embedding-based methods (CLIP+UMAP) to compare the robustness of the levels against standard benchmarks. The GGLC is a scalable resource for studying PCGML under hard constraints: it bridges communities working with structured data and provides a foundation for research on robust content generation. The dataset is released under CC-BY 4.0.
Abstract
Procedural content generation via machine learning (PCGML) in games involves using machine learning techniques to create game content such as maps and levels. 2D tile-based game levels have consistently served as a standard dataset for PCGML because they are a simplified version of game levels while maintaining the specific constraints typical of games, such as being solvable. In this work, we highlight the unique characteristics of game levels, including their nature as structured discrete data, the local and global constraints inherent in the games, and the sensitivity of the game levels to small changes in input. We define the robustness of data as a measure of sensitivity to small changes in input that cause a change in output, and we use this measure to analyze and compare these levels to state-of-the-art machine learning datasets, showcasing the subtle differences in their nature. We also construct a large dataset of levels from four games inspired by popular classic tile-based games. The dataset showcases these characteristics and addresses the challenge of sparse data in PCGML by being significantly larger than the datasets currently available.
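
To make the idea of sensitivity to small input changes concrete, the following is a minimal sketch, not the metric defined in the paper. It assumes a toy tile vocabulary (`S` start, `G` goal, `#` wall, `.` floor), treats solvability as the output label, and treats a single floor/wall swap as the small change. The function names `is_solvable` and `label_flip_rate` are hypothetical and chosen only for illustration.

```python
from collections import deque


def is_solvable(level):
    """Return True if 'G' is reachable from 'S' via 4-directional moves over non-wall tiles."""
    rows, cols = len(level), len(level[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if level[r][c] == "S")
    queue, seen = deque([start]), {start}
    while queue:
        r, c = queue.popleft()
        if level[r][c] == "G":
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and level[nr][nc] != "#" and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


def label_flip_rate(level):
    """Fraction of single-tile floor<->wall swaps that change the solvability label.

    Assumption: this is an illustrative stand-in for data non-robustness,
    not the definition used in the paper.
    """
    base = is_solvable(level)
    flips = total = 0
    for r, row in enumerate(level):
        for c, tile in enumerate(row):
            if tile not in ".#":
                continue  # leave the start and goal tiles untouched
            changed = [list(line) for line in level]
            changed[r][c] = "#" if tile == "." else "."
            total += 1
            flips += is_solvable(changed) != base
    return flips / total if total else 0.0


corridor = ["S..G"]                 # every floor tile lies on the only path
open_room = ["S..", "...", "..G"]   # many alternative paths exist
print(label_flip_rate(corridor), label_flip_rate(open_room))
```

Under these assumptions, the one-tile-wide corridor is maximally sensitive, since blocking any floor tile makes it unsolvable, while the open room is unaffected by any single swap. This mirrors, in miniature, the observation that sensitivity can differ strongly between level designs.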
