Solving the flexible job-shop scheduling problem through an enhanced deep reinforcement learning approach

Imanol Echeverria, Maialen Murua, Roberto Santana

TL;DR

This work tackles the flexible job-shop scheduling problem (FJSSP) under real-time disruptions by formulating it as a Markov Decision Process (MDP) and solving it with a heterogeneous graph neural network (HGNN)-driven policy trained via Proximal Policy Optimization (PPO). It introduces two key enhancements: dispatching-rule (DR) based action masking to prune the action space, and a Diverse Scheduling Policies (DSSP) pipeline that uses Bayesian optimization (BO) and KNN to generate and select diverse policies for parallel inference. Empirical results on two public benchmarks show that the proposed Enhanced Diverse Scheduling Policies (EDSP) approach outperforms traditional dispatching rules and three state-of-the-art deep reinforcement learning (DRL) methods, with particularly large gains on large instances, and is competitive with OR-Tools on large problems. The work advances real-time FJSSP solving by combining a compact, informative MDP representation with efficient policy generation and diversification, enabling scalable and practical decision-making in dynamic manufacturing settings.

Abstract

In scheduling problems common in industry and various real-world scenarios, responding in real time to disruptive events is essential. Recent methods propose the use of deep reinforcement learning (DRL) to learn policies capable of generating solutions under this constraint. The objective of this paper is to introduce a new DRL method for solving the flexible job-shop scheduling problem, particularly for large instances. The approach applies heterogeneous graph neural networks to a more informative graph representation of the problem. This novel modeling of the problem enhances the policy's ability to capture state information and improves its decision-making capacity. Additionally, we introduce two novel approaches to enhance the performance of the DRL approach: the first involves generating a diverse set of scheduling policies, while the second combines DRL with dispatching rules (DRs) that constrain the action space. Experimental results on two public benchmarks show that our approach outperforms DRs and achieves superior results compared to three state-of-the-art DRL methods, particularly for large instances.
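
As a rough illustration of how a dispatching rule can constrain a policy's action space, the sketch below masks out candidate actions before the policy's scores are turned into probabilities. It is not the paper's implementation: the shortest-processing-time (SPT) rule, the `keep` cutoff, the function names `dr_mask` and `masked_softmax`, and the example numbers are all assumptions chosen for illustration.

```python
import math


def dr_mask(proc_times, keep):
    """Keep the `keep` candidate actions with the shortest processing time.

    SPT is used here only as an example dispatching rule (an assumption).
    """
    order = sorted(range(len(proc_times)), key=lambda i: proc_times[i])
    kept = set(order[:keep])
    return [i in kept for i in range(len(proc_times))]


def masked_softmax(logits, mask):
    """Softmax over the allowed actions; masked-out actions get probability 0."""
    if not any(mask):
        raise ValueError("mask removes every action")
    # Subtract the largest allowed logit for numerical stability.
    top = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidates: one entry per (operation, machine) pair.
proc_times = [5.0, 2.0, 9.0, 3.0]  # processing time of each candidate
logits = [1.2, 0.3, 2.5, 0.3]      # scores a policy network might output

mask = dr_mask(proc_times, keep=2)
probs = masked_softmax(logits, mask)
```

In this toy example the policy's highest-scoring action (index 2) is removed because the rule considers it too slow, so the probability mass is split over the two remaining candidates. The trade-off is that a smaller action space is easier to learn over, but an overly strict rule can hide the best action from the policy.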

Paper Structure (16 sections, 13 equations, 5 figures, 5 tables, 1 algorithm)


Figures (5)

  • Figure 1: Diagram summarizing the different components of the proposed method.
  • Figure 2: Top: a simple example of an FJSSP instance, shown using our proposed representation. Bottom: the same instance after $o_1$ has been assigned to $m_1$, with a node and some edges removed from the graph.
  • Figure 3: The network architecture of our approach, EDSP.
  • Figure 4: The gap achieved by all generated candidate policies on two instances of the validation set, colored by the cluster each policy belongs to. The policies have been divided into 6 clusters. It can be observed that the blue and purple clusters achieve the best results for these two instances.
  • Figure 5: Comparison of DRL-based methods on the vdata and Behnke benchmarks using different strategies.