pyRDDLGym: From RDDL to Gym Environments

Ayal Taitler; Michael Gimelfarb; Jihwan Jeong; Sriram Gopalakrishnan; Martin Mladenov; Xiaotian Liu; Scott Sanner

TL;DR

pyRDDLGym is an open-source framework that automatically generates OpenAI Gym environments from RDDL descriptions, bridging reinforcement learning and planning with a scalable approach in which the model is explicit. The platform extends RDDL with terminal states, observations, and lifted-to-grounded translation, while keeping Gym-compatible interfaces and supporting both MDPs and POMDPs. It provides built-in environments, an example manager, and auxiliary tools (XADDs, DBN visualization), plus a model-based planner (JAXPlanner) for differentiable planning over CPFs and relaxed logic. By decoupling problem design from environment implementation, pyRDDLGym enables rapid benchmark creation, verified modeling, and hybrid learning frameworks that exploit explicit problem structure. The work aims to foster collaboration between the RL and planning communities and offers practical pathways to scalable, model-aware benchmarks.

Abstract

We present pyRDDLGym, a Python framework for the automatic generation of OpenAI Gym environments from declarative RDDL descriptions. The discrete time step evolution of variables in RDDL is described by conditional probability functions (CPFs), which fits naturally into the Gym step scheme. Furthermore, since RDDL is a lifted description, modifying and scaling up environments to support multiple entities and different configurations becomes trivial rather than a tedious, error-prone process. We hope that pyRDDLGym will give new momentum to the reinforcement learning community by enabling easy and rapid development of benchmarks, thanks to the unique expressive power of RDDL. By providing explicit access to the model in the RDDL description, pyRDDLGym can also facilitate research on hybrid approaches that learn from interaction while leveraging model knowledge. We present the design and built-in examples of pyRDDLGym, and the additions to the RDDL language that were incorporated into the framework.
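To make the correspondence between CPFs and the Gym step scheme concrete, the following is a minimal, dependency-free sketch. It is not pyRDDLGym code and does not use its API: the class `ToyCPFEnv`, its constructor arguments, and the simplified "alive" rule are all hypothetical, chosen only for illustration. A single lifted rule is grounded once per cell, and each call to `step` evaluates every grounded CPF exactly once. The CPF here is deterministic; a stochastic CPF would sample the next value instead.

```python
class ToyCPFEnv:
    """Hypothetical Gym-style environment driven by one lifted CPF.

    Illustration only: this is NOT the pyRDDLGym API.
    """

    def __init__(self, objects, neighbors, init_state, horizon):
        self.objects = list(objects)         # grounded objects (cells)
        self.neighbors = neighbors           # non-fluent: static topology
        self.init_state = dict(init_state)   # initial state fluents
        self.horizon = horizon               # episode length
        self.reset()

    def reset(self):
        """Restore the initial state and return a copy of it."""
        self.t = 0
        self.state = dict(self.init_state)
        return dict(self.state)

    def _cpf_alive(self, cell, action):
        """Lifted rule alive'(?c), evaluated for one grounded cell."""
        if action.get(cell, False):          # action fluent forces the cell alive
            return True
        live = sum(self.state[nb] for nb in self.neighbors[cell])
        return live == 1                     # simplified, deterministic rule

    def step(self, action):
        """Advance one time step by evaluating every grounded CPF once."""
        next_state = {c: self._cpf_alive(c, action) for c in self.objects}
        self.state = next_state
        self.t += 1
        reward = float(sum(next_state.values()))   # number of live cells
        done = self.t >= self.horizon
        return dict(next_state), reward, done, {}


# Three cells in a line; scaling up only requires a larger object list
# and topology, while the lifted rule stays unchanged.
env = ToyCPFEnv(
    objects=["c1", "c2", "c3"],
    neighbors={"c1": ["c2"], "c2": ["c1", "c3"], "c3": ["c2"]},
    init_state={"c1": True, "c2": False, "c3": False},
    horizon=2,
)
state = env.reset()
state, reward, done, _ = env.step({})            # no-op action
state, reward, done, _ = env.step({"c3": True})  # force c3 alive
```

The point of the sketch is the separation of concerns: the rule is written once over a placeholder cell, and the object list and topology decide how many grounded variables exist. This mirrors, under the assumptions stated above, why changing an instance does not require touching the domain description.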

Paper Structure (22 sections, 13 equations, 7 figures, 1 table)

Introduction
RDDL Support and Extensions
Language Variant
Level Reasoning
Terminal States
Design and Implementation
Instantiation of Gym Environments
Observation and Action Spaces
From Lifted to Grounded
MDPs and POMDPs
Single Time Step Evolution
Resetting the Environment
Visualization
pyRDDLGym Beyond the Engine
Built-in Environments
...and 7 more sections

Figures (7)

Figure 1: pyRDDLGym design concept.
Figure 2: pyRDDLGym code examples. A pyRDDLGym environment is characterized by an RDDL domain file and an instance & non-fluents file. (a) Fluents, CPFs, and reward for the game of life problem (b) A non-fluents block, defining a two neighboring cell topology. (c) An instance block defining the parameters and init-state of the problem. (d) Using the domain and instance files, an RDDLEnv can be initialized. Interaction is similar to that of any OpenAI Gym environment.
Figure 3: Examples of environments implemented in pyRDDLGym. From left to right: elevators, UAV, power unit commitment, Mars rover and recommender systems.
Figure 4: pyRDDLGym auxiliary tools.
Figure 5: JAXPlanner modes of operations.
...and 2 more figures
