Attention-based Reinforcement Learning for Combinatorial Optimization: Application to Job Shop Scheduling Problem

Jaejin Lee, Seho Kee, Mani Janakiram, George Runger

TL;DR

An attention-based reinforcement learning method, which combines policy gradient learning with a modified transformer architecture, surpasses the results of recent studies and outperforms commonly used heuristic rules on job shop scheduling problems. Its trained learners can also be reused on larger problems that were not part of the training set.

Abstract

Job shop scheduling problems represent a significant and complex facet of combinatorial optimization problems, which have traditionally been addressed through either exact or approximate solution methodologies. However, the practical application of these solutions is often challenged due to the complexity of real-world problems. Even when utilizing an approximate solution approach, the time required to identify a near-optimal solution can be prohibitively extensive, and the solutions derived are generally not applicable to new problems. This study proposes an innovative attention-based reinforcement learning method specifically designed for the category of job shop scheduling problems. This method integrates a policy gradient reinforcement learning approach with a modified transformer architecture. A key finding of this research is the ability of our trained learners within the proposed method to be repurposed for larger-scale problems that were not part of the initial training set. Furthermore, empirical evidence demonstrates that our approach surpasses the results of recent studies and outperforms commonly implemented heuristic rules. This suggests that our method offers a promising avenue for future research and practical application in the field of job shop scheduling problems.

Paper Structure

This paper contains 10 sections, 7 equations, 2 figures, and 2 tables.

Figures (2)

  • Figure 1: Generalized performance of our ARLS model compared with heuristic rules and other studies on the benchmark datasets ABZ, FT, YN, SWVM, and ORB. Results by zhang2020learning are denoted as Z20, park2021schedulenet as P21a, park2021learning as P21b, and yuan2023solving as Y23. The commonly adopted heuristic rules shown are first-in-first-out (FIFO), shortest processing time (SPT), and most work remaining (MWKR).
  • Figure 2: Generalized performance of our ARLS model compared with heuristic rules and other studies on the benchmark datasets TA and DMU. Results by zhang2020learning are denoted as Z20, chen2022deep as C22, and yuan2023solving as Y23. The commonly adopted heuristic rules shown are first-in-first-out (FIFO), shortest processing time (SPT), and most work remaining (MWKR).
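
For readers unfamiliar with the heuristic baselines named in the figure captions, the following is a minimal Python sketch of how FIFO, SPT, and MWKR can act as dispatching rules. It is not the paper's implementation. The function `dispatch`, the three priority functions, and the toy instance `jobs` are all hypothetical names and data introduced here for illustration. The simulation is a simplified greedy scheduler that appends each chosen operation at the earliest time its job and machine are both free; it does not fill earlier idle gaps.

```python
from typing import Callable, List, Tuple

# A job is an ordered list of operations: (machine index, processing time).
Job = List[Tuple[int, int]]
# A priority function maps (job, index of next operation, job ready time)
# to a score; the job with the LOWEST score is dispatched next.
Priority = Callable[[Job, int, int], float]


def fifo(job: Job, k: int, ready: int) -> float:
    """First-in-first-out: prefer the job whose next operation became ready earliest."""
    return ready


def spt(job: Job, k: int, ready: int) -> float:
    """Shortest processing time: prefer the job whose next operation is shortest."""
    return job[k][1]


def mwkr(job: Job, k: int, ready: int) -> float:
    """Most work remaining: prefer the job with the largest remaining total work."""
    return -sum(duration for _, duration in job[k:])


def dispatch(jobs: List[Job], priority: Priority) -> int:
    """Greedy scheduling driven by a priority rule; returns the makespan."""
    n_machines = 1 + max(machine for job in jobs for machine, _ in job)
    next_op = [0] * len(jobs)          # index of each job's next unscheduled operation
    job_ready = [0] * len(jobs)        # time at which each job's previous operation ends
    machine_ready = [0] * n_machines   # time at which each machine becomes free
    remaining = sum(len(job) for job in jobs)

    while remaining:
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        # Ties are broken by the lower job index.
        j = min(candidates, key=lambda c: (priority(jobs[c], next_op[c], job_ready[c]), c))
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
        remaining -= 1

    return max(job_ready)


# Hypothetical toy instance: 3 jobs, 3 machines.
jobs: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]

results = {rule.__name__: dispatch(jobs, rule) for rule in (fifo, spt, mwkr)}
print(results)
```

Each rule is a fixed, hand-written scoring function. In a learned approach such as the one the abstract describes, that fixed score would be replaced by the output of a trained policy, though the paper's actual state and action design is not specified in this excerpt.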