schlably: A Python Framework for Deep Reinforcement Learning Based Scheduling Experiments

Constantin Waubert de Puiseau, Jannik Peters, Christian Dörpelkus, Hasan Tercan, Tobias Meisen

TL;DR

Schlably addresses fragmentation in DRL-based production scheduling research by providing a modular Python framework that unifies environments, data generation, agents, and evaluation. It ships with pre-implemented benchmarks, flexible scheduling problem generators (including resource constraints), and logging and visualization tools to improve reproducibility. The framework adheres to OpenAI Gym conventions and integrates with RL libraries, enabling out-of-the-box experimentation and cross-comparison using tools like Weights & Biases. The framework is intended to accelerate DRL-PS research and to ease the transfer of methods to real-world scheduling problems.

Abstract

Research on deep reinforcement learning (DRL) based production scheduling (PS) has gained considerable attention in recent years, primarily due to the high demand for optimizing scheduling problems in diverse industry settings. Numerous studies are carried out and published as stand-alone experiments that often vary only slightly with respect to problem setups and solution approaches. The programmatic core of these experiments is typically very similar. Despite this, no standardized and resilient framework for experimentation on PS problems with DRL algorithms has been established so far. In this paper, we introduce schlably, a Python-based framework that provides researchers with a comprehensive toolset to facilitate the development of PS solution strategies based on DRL. schlably eliminates the redundant overhead work that the creation of a sturdy and flexible backbone requires and increases the comparability and reusability of conducted research work.
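
To make the "shared programmatic core" of such experiments concrete, the following is a minimal sketch of a scheduling environment written in the style of the classic OpenAI Gym interface (reset returns an observation, step returns observation, reward, done, info). It is an illustration only and is not schlably's code: the names ToyJobShopEnv, Operation, rollout, and lowest_index_policy are hypothetical, the class deliberately does not import gym, and the reward (negative increase in makespan) is just one plausible choice.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Operation:
    """One processing step of a job (hypothetical helper type)."""
    machine: int
    duration: int


class ToyJobShopEnv:
    """Toy job-shop environment following the classic Gym reset/step pattern.

    Hypothetical illustration; this is NOT the schlably environment API.
    An action is the index of the job whose next operation is scheduled.
    """

    def __init__(self, jobs: List[List[Operation]]):
        self.jobs = jobs
        self.num_machines = 1 + max(op.machine for job in jobs for op in job)
        self.reset()

    def reset(self) -> Tuple[int, ...]:
        self.next_op = [0] * len(self.jobs)           # next operation index per job
        self.job_ready = [0] * len(self.jobs)         # time each job becomes free
        self.machine_ready = [0] * self.num_machines  # time each machine becomes free
        return self._observation()

    def _observation(self) -> Tuple[int, ...]:
        return tuple(self.next_op)

    def valid_actions(self) -> List[int]:
        """Jobs that still have unscheduled operations."""
        return [j for j, job in enumerate(self.jobs) if self.next_op[j] < len(job)]

    def makespan(self) -> int:
        """Completion time of the latest operation scheduled so far."""
        return max(self.job_ready)

    def step(self, action: int) -> Tuple[Tuple[int, ...], int, bool, Dict[str, int]]:
        if action not in self.valid_actions():
            raise ValueError(f"job {action} has no remaining operations")
        before = self.makespan()
        op = self.jobs[action][self.next_op[action]]
        # An operation starts once both its job and its machine are free.
        start = max(self.job_ready[action], self.machine_ready[op.machine])
        end = start + op.duration
        self.job_ready[action] = end
        self.machine_ready[op.machine] = end
        self.next_op[action] += 1
        done = not self.valid_actions()
        reward = -(self.makespan() - before)  # rewards sum to -makespan
        return self._observation(), reward, done, {"makespan": self.makespan()}


def rollout(env: ToyJobShopEnv, policy: Callable[[ToyJobShopEnv], int]) -> int:
    """Run one episode with the given policy and return the final makespan."""
    env.reset()
    done = False
    total_reward = 0
    while not done:
        _, reward, done, _ = env.step(policy(env))
        total_reward += reward
    return -total_reward


def lowest_index_policy(env: ToyJobShopEnv) -> int:
    """Simple dispatching rule standing in for a trained agent."""
    return env.valid_actions()[0]


# Two jobs, two machines; each job visits both machines in a fixed order.
JOBS = [
    [Operation(machine=0, duration=3), Operation(machine=1, duration=2)],
    [Operation(machine=1, duration=2), Operation(machine=0, duration=4)],
]
greedy_makespan = rollout(ToyJobShopEnv(JOBS), lowest_index_policy)
print("makespan:", greedy_makespan)
```

Because the policy is just a callable that picks among valid actions, a heuristic baseline and a learned agent can be evaluated by the same rollout loop, which is the kind of comparability a shared framework is meant to provide.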

Paper Structure (18 sections, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Overview of schlably project and code structure.
  • Figure 2: Comparing agent runs in Weights & Biases (screenshot from the web interface shown on the left-hand side). a) Visualized training curves for interpreting the learning performance of the agent. b) Gantt chart depicting the solution of the trained agent on a selected test instance. c) Table providing evaluation results and comparison of the trained agents and benchmark methods on the test instances.