A Configurable Library for Generating and Manipulating Maze Datasets

Michael Igorevich Ivanitskiy; Rusheb Shah; Alex F. Spies; Tilman Räuker; Dan Valentine; Can Rager; Lucia Quirke; Chris Mathwin; Guillaume Corlouer; Cecilia Diniz Behn; Samy Wu Fung

TL;DR

The paper addresses the challenge of evaluating model robustness to distributional shifts by using procedurally generated maze-solving tasks as a controllable testbed. It presents maze-dataset, a configurable Python library that generates, filters, and exports mazes in formats suitable for CNNs and autoregressive models, with explicit metadata and PyTorch integration. Key contributions include multiple maze-generation algorithms (e.g., gen_dfs, gen_wilson, gen_percolation, gen_dfs_percolation), flexible filtering, reversible output formats (ASCII, raster, token sequences), training and evaluation utilities, and generation-speed benchmarks. The work enables reproducible investigations into out-of-distribution generalization and interpretability, and outlines future enhancements such as additional algorithms and higher-dimensional support.

Abstract

Understanding how machine learning models respond to distributional shifts is a key research challenge. Mazes serve as an excellent testbed due to varied generation algorithms offering a nuanced platform to simulate both subtle and pronounced distributional shifts. To enable systematic investigations of model behavior on out-of-distribution data, we present $\texttt{maze-dataset}$, a comprehensive library for generating, processing, and visualizing datasets consisting of maze-solving tasks. With this library, researchers can easily create datasets, having extensive control over the generation algorithm used, the parameters fed to the algorithm of choice, and the filters that generated mazes must satisfy. Furthermore, it supports multiple output formats, including rasterized and text-based, catering to convolutional neural networks and autoregressive transformer models. These formats, along with tools for visualizing and converting between them, ensure versatility and adaptability in research applications.
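As a rough illustration of the kind of generation algorithm the abstract refers to, the following is a minimal, self-contained sketch of randomized depth-first search on a square lattice, together with an ASCII rendering. It is not the library's implementation or API: the names gen_dfs_sketch and to_ascii, the edge-set representation, and the seed parameter are all assumptions made for this example.

```python
import random


def gen_dfs_sketch(n, seed=0):
    """Carve an n x n maze with randomized depth-first search.

    Hypothetical helper (not part of maze-dataset). Returns a set of
    frozenset({cell_a, cell_b}) passages between adjacent cells.
    """
    rng = random.Random(seed)
    start = (rng.randrange(n), rng.randrange(n))
    visited = {start}
    stack = [start]
    passages = set()
    while stack:
        r, c = stack[-1]
        # Unvisited lattice neighbors of the cell on top of the stack.
        neighbors = [
            (r + dr, c + dc)
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0))
            if 0 <= r + dr < n
            and 0 <= c + dc < n
            and (r + dr, c + dc) not in visited
        ]
        if not neighbors:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(neighbors)
        passages.add(frozenset(((r, c), nxt)))
        visited.add(nxt)
        stack.append(nxt)
    return passages


def to_ascii(n, passages):
    """Render the maze as text: '#' is a wall, ' ' is open space."""
    grid = [["#"] * (2 * n + 1) for _ in range(2 * n + 1)]
    for r in range(n):
        for c in range(n):
            grid[2 * r + 1][2 * c + 1] = " "
    for edge in passages:
        (r1, c1), (r2, c2) = tuple(edge)
        # The wall between two adjacent cells sits at their midpoint.
        grid[r1 + r2 + 1][c1 + c2 + 1] = " "
    return "\n".join("".join(row) for row in grid)


if __name__ == "__main__":
    print(to_ascii(4, gen_dfs_sketch(4, seed=0)))
```

Because every cell is visited exactly once and each visit adds one passage, the result is a spanning tree of the lattice with n*n - 1 passages, so any two cells are joined by exactly one path. Varying the algorithm or its parameters changes the statistics of such mazes, which is what makes them usable for controlled distributional shifts.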

Paper Structure (11 sections, 7 figures, 1 table)

Introduction
Maze Generation and Usage
Output Formats
Training and Evaluation
Benchmarks of Generation Speed
Implementation
Relation to Existing Works
Limitations of maze-dataset
Conclusion
Acknowledgements
Appendix: Examples of Generated Mazes

Figures (7)

Figure 1: Example mazes from various algorithms. Left to right: randomized depth-first search (RDFS), RDFS without forks, constrained RDFS, Wilson's algorithm, RDFS with percolation ($p=0.1$), RDFS with percolation ($p=0.4$), random stack RDFS. Further examples are available in the appendix.
Figure 2: Various output formats. Top row (left to right): ASCII diagram, rasterized pixel grid, and advanced display. Bottom row: text format for autoregressive networks.
Figure 3: The input is the rasterized maze without the path marked (left); the target is the same maze with everything but the correct path removed. Configuration options control whether endpoints are included and whether empty cells are filled in.
Figure 4: Left: maze prompt up to <PATH_START>. Right: relative ordering of the cells in the vocabulary. Note that the top-left square of size $n \times n$ can be described using only the first $n^2$ tokens in the vocabulary.
Figure 5: Plots of maze generation time. Generation time scales exponentially with maze size for all algorithms (left). Generation time does not depend on the number of mazes being generated, and there is minimal overhead to initializing the generation process for a small dataset (right). Wilson's algorithm is notably less efficient than others and has high variance. Note that for both plots, values are averaged across all parameter sets for that algorithm, and parallelization is disabled.
...and 2 more figures
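The vocabulary property noted in the Figure 4 caption (the top-left $n \times n$ square uses only the first $n^2$ cell tokens) can be illustrated with a small sketch. The ordering below is one possible scheme with that property, numbering cells shell by shell outward from the top-left corner; it is an assumption for illustration, the function name cell_index is hypothetical, and the library's actual ordering may differ.

```python
def cell_index(row, col):
    """Map a cell to a vocabulary position, shell by shell.

    Hypothetical ordering (not taken from maze-dataset). Shell k holds
    the 2k + 1 cells with max(row, col) == k and occupies positions
    k*k through (k + 1)*(k + 1) - 1.
    """
    k = max(row, col)
    if row < k:
        # Right-hand edge of the shell: column k, rows 0 .. k-1.
        return k * k + row
    # Bottom edge of the shell: row k, columns 0 .. k.
    return k * k + k + col


def square_indices(n):
    """Vocabulary positions used by the top-left n x n square."""
    return {cell_index(r, c) for r in range(n) for c in range(n)}
```

With an ordering of this kind, a model trained on small mazes sees a prefix of the cell vocabulary used for larger mazes, so the same token embeddings can be reused when the grid size grows.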
