Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks

Maxime Chevalier-Boisvert; Bolun Dai; Mark Towers; Rodrigo de Lazcano; Lucas Willems; Salem Lahlou; Suman Pal; Pablo Samuel Castro; Jordan Terry

TL;DR

The paper presents Minigrid and Miniworld, two lightweight, modular RL libraries for goal-oriented tasks built around a minimal yet extensible API. It describes the design philosophy, the environment specifications, and the unified API shared by both libraries, which makes it possible to study transfer learning between 2D and 3D observation spaces. Two case studies demonstrate this: PPO-based transfer experiments with RL agents and a transfer experiment with human subjects. The case studies also provide implementation guidance, wrappers, and code-effort estimates, and show that the workflow is reproducible. The work positions these libraries as accessible platforms for rapid research while acknowledging limitations, namely the simplicity of the supported environment types and the performance limits of Python, and points to future directions such as human-in-the-loop decision-making.

Abstract

We present the Minigrid and Miniworld libraries, which provide a suite of goal-oriented 2D and 3D environments. The libraries were explicitly created with a minimalistic design paradigm to allow users to rapidly develop new environments for a wide range of research-specific needs. As a result, both have received widespread adoption by the RL community, facilitating research in a wide range of areas. In this paper, we outline the design philosophy, environment details, and their world generation API. We also showcase the additional capabilities brought by the unified API between Minigrid and Miniworld through case studies on transfer learning (for both RL agents and humans) between the different observation spaces. The source code of Minigrid and Miniworld can be found at https://github.com/Farama-Foundation/{Minigrid, Miniworld} along with their documentation at https://{minigrid, miniworld}.farama.org/.

Paper Structure (21 sections, 1 equation, 5 figures, 2 tables)

Introduction
Minigrid & Miniworld Libraries
Design Philosophy
Minigrid Environments
Miniworld Environments
Constructing and Extending Environments
Adoption
Case Studies for Utilizing the Unified API
RL Agent Transfer Learning Between Different Observations Spaces
Transfer Learning Between Different Observations Spaces for 10 Human Subjects
Implementation Details
Related Works
Conclusion
Dataset Documentation & URL
Implementation Details for Transfer Learning Between Different Observations Spaces for 10 Human Subjects
...and 6 more sections

Figures (5)

Figure 1: Example environments from Minigrid and Miniworld.
Figure 2: Example Minigrid environments with their mission instruction. For each of the environments, the highlighted region indicates the partial observation received by the agent.
Figure 3: Visualization of the miniworld-gotoobj-env (left) and minigrid-gotoobj-env (right). The miniworld-gotoobj-env image shows both the top-down view and the agent view (top-right window). During training the agent only has access to the agent view of the environment.
Figure 4: Trajectories from one human subject when testing transferring experience on Minigrid environments to Miniworld. The numbers correspond to the episode number.
Figure 5: GitHub Stars evolution for Minigrid and Miniworld (recorded on June 12th, 2023, data obtained using https://star-history.com)
