SDGym: Low-Code Reinforcement Learning Environments using System Dynamics Models

Emmanuel Klu; Sameer Sethi; DJ Passey; Donald Martin

TL;DR

This work introduces SDGym, a low-code library built on the OpenAI Gym framework that generates Reinforcement Learning (RL) environments from System Dynamics (SD) models. It demonstrates feasibility with an SD model of electric vehicle adoption, compares the PySD and BPTK-Py simulators for parity, and trains a D4PG agent to learn dynamic interventions, illustrating the potential for richer, policy-relevant RL environments. The paper outlines the design decisions needed to reconcile the SD and RL paradigms, discusses extensibility to fairness and multi-agent RL, and open-sources the toolkit to encourage interdisciplinary collaboration. The approach aims to make RL environments more realistic and to speed up the discovery of robust interventions in complex societal systems, so that researchers from the SD and RL communities can collaborate more effectively.

Abstract

Understanding the long-term impact of algorithmic interventions on society is vital to achieving responsible AI. Traditional evaluation strategies often fall short because society is complex, adaptive, and dynamic. While reinforcement learning (RL) can be a powerful approach for optimizing decisions in dynamic settings, the difficulty of designing realistic environments remains a barrier to building robust agents that perform well in practice. To address this issue, we tap into the field of system dynamics (SD) as a complementary method that incorporates collaborative practices for specifying simulation models. We introduce SDGym, a low-code library built on the OpenAI Gym framework that enables the generation of custom RL environments from SD simulation models. Through a feasibility study, we validate that well-specified, rich RL environments can be generated from preexisting SD models and a few lines of configuration code. We demonstrate the capabilities of the SDGym environment using an SD model of the electric vehicle adoption problem. We compare two SD simulators, PySD and BPTK-Py, for parity, and train a D4PG agent using the Acme framework to showcase learning and environment interaction. Our preliminary findings underscore the dual potential of SD to improve RL environment design and of RL to improve dynamic policy discovery within SD models. By open-sourcing SDGym, we intend to galvanize further research and promote adoption across the SD and RL communities, thereby catalyzing collaboration in this emerging interdisciplinary space.
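
To make the core idea concrete, the sketch below shows how a stock-and-flow model might be exposed through a Gym-style `reset`/`step` interface. It is only an illustration under stated assumptions: the class `ToyEVAdoptionEnv`, its parameters, the reward, and the Euler integration are all hypothetical and stand in for neither the SDGym API nor the electric vehicle adoption model used in the paper. It uses the standard library only and mimics the Gym interface rather than importing it.

```python
from dataclasses import dataclass


@dataclass
class ToyEVAdoptionEnv:
    """Gym-style wrapper (reset/step) around a toy stock-and-flow model.

    Hypothetical illustration; not the SDGym API or the paper's SD model.
    """
    total_fleet: float = 1000.0  # total vehicles, assumed constant
    base_rate: float = 0.02      # fraction of ICE vehicles switching per step, no subsidy
    subsidy_gain: float = 0.08   # extra switching fraction at full subsidy
    subsidy_cost: float = 5.0    # reward penalty per step at full subsidy
    dt: float = 1.0              # Euler integration step
    horizon: int = 50            # steps per episode

    def __post_init__(self):
        self.reset()

    def reset(self):
        # Stocks: EVs start small; the remaining vehicles are ICE.
        self.ev = 10.0
        self.t = 0
        return self._observe()

    def _observe(self):
        # Observation is the pair of stocks (EV fleet, ICE fleet).
        return (self.ev, self.total_fleet - self.ev)

    def step(self, action):
        # Action is a subsidy level, clipped to [0, 1].
        subsidy = min(max(float(action), 0.0), 1.0)
        ice = self.total_fleet - self.ev
        # Flow: ICE owners switch to EVs faster when subsidized.
        adoption_flow = ice * (self.base_rate + self.subsidy_gain * subsidy)
        self.ev += adoption_flow * self.dt  # Euler update of the stock
        self.t += 1
        # Reward: vehicles adopted this step minus the cost of the subsidy.
        reward = adoption_flow * self.dt - self.subsidy_cost * subsidy
        done = self.t >= self.horizon
        return self._observe(), reward, done, {}


def rollout(env, policy):
    """Run one episode with a policy mapping observation -> action."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return obs, total
```

Calling `rollout(ToyEVAdoptionEnv(), lambda obs: 0.5)` runs one episode under a constant half-strength subsidy; an RL agent would replace the constant policy with one that adjusts the subsidy over time in response to the observed stocks.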
