A Configurable Library for Generating and Manipulating Maze Datasets
Michael Igorevich Ivanitskiy, Rusheb Shah, Alex F. Spies, Tilman Räuker, Dan Valentine, Can Rager, Lucia Quirke, Chris Mathwin, Guillaume Corlouer, Cecilia Diniz Behn, Samy Wu Fung
TL;DR
The paper addresses the challenge of evaluating model robustness to distributional shifts by using maze-solving tasks as a controllable testbed. It presents maze-dataset, a configurable Python library that generates, filters, and exports mazes in formats suited to CNNs and autoregressive models, with explicit metadata and PyTorch integration. Key contributions include multiple maze-generation algorithms (e.g., gen_dfs, gen_wilson, gen_percolation, gen_dfs_percolation), flexible filtering, reversible output formats (ASCII, raster, token sequences), utilities for training and evaluation, and generation-speed benchmarks. The work enables reproducible investigations of out-of-distribution generalization and interpretability, and outlines future enhancements, such as additional generation algorithms and support for higher-dimensional mazes, to serve broader ML research applications.
Abstract
Understanding how machine learning models respond to distributional shifts is a key research challenge. Mazes are an excellent testbed because their varied generation algorithms make it possible to simulate both subtle and pronounced distributional shifts. To enable systematic investigations of model behavior on out-of-distribution data, we present $\texttt{maze-dataset}$, a comprehensive library for generating, processing, and visualizing datasets consisting of maze-solving tasks. With this library, researchers can easily create datasets with extensive control over the generation algorithm used, the parameters passed to that algorithm, and the filters that generated mazes must satisfy. Furthermore, the library supports multiple output formats, including rasterized and text-based formats, which cater to convolutional neural networks and autoregressive transformer models, respectively. These formats, along with tools for visualizing and converting between them, ensure versatility and adaptability in research applications.
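To make the generate, filter, and export pipeline described above concrete, the following is a minimal, self-contained sketch in plain Python. It is not the maze-dataset API: the function names (gen_dfs_sketch, shortest_path, to_ascii, make_dataset), the choice of corner-to-corner start and end cells, and the min_path_len filter are all hypothetical and chosen only for illustration. The sketch generates a maze by randomized depth-first search, solves it by breadth-first search, keeps only mazes whose solution is long enough, and renders a maze as ASCII text.

```python
import random
from collections import deque


def gen_dfs_sketch(n, seed=0):
    """Randomized depth-first search on an n x n grid (illustrative only).

    Returns a dict mapping each cell (row, col) to the set of adjacent
    cells it is connected to. The result is a spanning tree of the grid.
    """
    rng = random.Random(seed)
    conn = {(r, c): set() for r in range(n) for c in range(n)}
    start = (rng.randrange(n), rng.randrange(n))
    visited = {start}
    stack = [start]
    while stack:
        r, c = stack[-1]
        unvisited = [
            (r + dr, c + dc)
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0))
            if (r + dr, c + dc) in conn and (r + dr, c + dc) not in visited
        ]
        if not unvisited:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(unvisited)
        conn[(r, c)].add(nxt)  # remove the wall between the two cells
        conn[nxt].add((r, c))
        visited.add(nxt)
        stack.append(nxt)
    return conn


def shortest_path(conn, start, end):
    """Breadth-first search; assumes end is reachable from start."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == end:
            break
        for nb in conn[cell]:
            if nb not in parent:
                parent[nb] = cell
                queue.append(nb)
    path = []
    cell = end
    while cell is not None:
        path.append(cell)
        cell = parent[cell]
    return path[::-1]


def to_ascii(conn, n, path=()):
    """Render the maze as text: '#' wall, ' ' open cell, 'X' path cell."""
    grid = [["#"] * (2 * n + 1) for _ in range(2 * n + 1)]
    for (r, c), nbs in conn.items():
        grid[2 * r + 1][2 * c + 1] = " "
        for nr, nc in nbs:
            grid[r + nr + 1][c + nc + 1] = " "  # opening between neighbours
    for r, c in path:
        grid[2 * r + 1][2 * c + 1] = "X"
    return "\n".join("".join(row) for row in grid)


def make_dataset(n_mazes, grid_n, min_path_len=0, seed=0):
    """Generate mazes and keep those whose solution has >= min_path_len cells."""
    kept = []
    for i in range(n_mazes):
        conn = gen_dfs_sketch(grid_n, seed=seed + i)
        path = shortest_path(conn, (0, 0), (grid_n - 1, grid_n - 1))
        if len(path) >= min_path_len:
            kept.append((conn, path))
    return kept


if __name__ == "__main__":
    conn, path = make_dataset(n_mazes=1, grid_n=4)[0]
    print(to_ascii(conn, 4, path))
```

Swapping gen_dfs_sketch for a different generator, or changing min_path_len, alters the distribution of mazes in the resulting dataset, which is the kind of controlled distributional shift the abstract refers to.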
