The NetHack Learning Environment

Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

TL;DR

The NetHack Learning Environment (NLE) is a fast, complex, procedurally generated RL testbed built on NetHack, designed to advance research on exploration, planning, memory, and generalization. It combines a rich symbolic observation space, a large action set, and long-horizon dynamics with a scalable Gym interface, which supports a diverse task suite and benchmarks with IMPALA and Random Network Distillation (RND) baselines. Empirical results show meaningful gains from exploration strategies across several tasks, and the generalization and qualitative analyses reveal core failure modes and dynamics such as complex entity interactions and rare events like locating the Oracle. The work argues that NetHack's depth, randomness, and abundance of embedded knowledge can sustain long-term progress toward robust, transferable RL algorithms while remaining accessible to resource-constrained research groups. Future directions include upgrading to NetHack 3.7, scripting for user-defined tasks, and harnessing language-based signals for auxiliary learning.

Abstract

Progress in Reinforcement Learning (RL) algorithms goes hand-in-hand with the development of challenging environments that test the limits of current methods. While existing RL environments are either sufficiently complex or based on fast simulation, they are rarely both. Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack. We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL, while dramatically reducing the computational resources required to gather a large amount of experience. We compare NLE and its task suite to existing alternatives, and discuss why it is an ideal medium for testing the robustness and systematic generalization of RL agents. We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration, alongside qualitative analysis of various agents trained in the environment. NLE is open source at https://github.com/facebookresearch/nle.
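
The abstract names Random Network Distillation (RND) as the exploration method. As a rough illustration of the idea only, the sketch below computes an RND-style intrinsic reward: a fixed, randomly initialized target function embeds each observation, a trained predictor tries to match it, and the squared prediction error serves as a novelty bonus. Everything here is an assumption made for illustration: the class name `LinearRND`, the linear maps (the baseline released with NLE uses neural networks), the dimensions, and the learning rate are hypothetical and are not taken from the paper or the NLE codebase.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix (out_dim x in_dim) stored as nested lists."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def forward(weights, obs):
    """Apply a linear map to an observation vector."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


class LinearRND:
    """Toy RND with linear maps (hypothetical; real RND uses neural networks)."""

    def __init__(self, obs_dim, embed_dim=4, lr=0.05, seed=0):
        rng = random.Random(seed)
        self.target = make_linear(obs_dim, embed_dim, rng)     # fixed, never trained
        self.predictor = make_linear(obs_dim, embed_dim, rng)  # trained to match target
        self.lr = lr

    def intrinsic_reward(self, obs):
        """Squared prediction error: large for observations rarely trained on."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t))

    def update(self, obs):
        """One gradient step on the squared error for a visited observation."""
        t = forward(self.target, obs)
        p = forward(self.predictor, obs)
        for k, row in enumerate(self.predictor):
            err = p[k] - t[k]
            for j in range(len(row)):
                row[j] -= self.lr * 2.0 * err * obs[j]


if __name__ == "__main__":
    rnd = LinearRND(obs_dim=4)
    seen, novel = [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]
    for _ in range(50):
        rnd.update(seen)  # repeatedly "visit" the same observation
    print("seen bonus:", rnd.intrinsic_reward(seen))
    print("novel bonus:", rnd.intrinsic_reward(novel))
```

In this toy setting the bonus for the repeatedly visited observation decays toward zero while the bonus for the unvisited one is unchanged, which is the property that makes the prediction error usable as an exploration signal. In practice the bonus would be added to the environment reward with some weighting, a detail this sketch leaves out.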

Paper Structure

This paper contains 45 sections, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Annotated example of an agent at two different stages in NetHack (left: a procedurally generated first level of the Dungeons of Doom; right: Gnomish Mines). A larger version of this figure is displayed in the appendix.
  • Figure 2: The hero (@) has to cross water (}) to get past Medusa (@, out of the hero's line of sight) down the staircase (>) to the next level.
  • Figure 3: Overview of the core architecture of the baseline models released with NLE. A larger version of this figure is displayed in the appendix.
  • Figure 4: Training and test performance when training on restricted sets of seeds.
  • Figure 5: Mean return of the last $100$ episodes averaged over five runs.
  • ...and 7 more figures