Neural Combinatorial Optimization for Stochastic Flexible Job Shop Scheduling Problems
Igor G. Smit, Yaoxin Wu, Pavel Troubil, Yingqian Zhang, Wim P. M. Nuijten
TL;DR
This work introduces a Scenario Processing Module (SPM) that extends neural combinatorial optimization to stochastic flexible job shop scheduling. The SPM embeds multiple sampled scenarios with an attention-based mechanism and integrates them into a DAN backbone (SPM-DAN), so the learned policy can optimize either the Value-at-Risk (VaR) or the expected makespan under uncertainty. The approach models scheduling as a Markov decision process with a dual state representation (deterministic and scenario-based) and trains the policy with proximal policy optimization (PPO). On synthetic and public datasets, it improves on traditional heuristics and existing learning methods, scales well, and remains robust across processing-time distributions, which makes it practical for fast, robust scheduling under stochastic processing times.
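To make the two objectives concrete, the following is a minimal sketch of how expected makespan and VaR could be estimated from sampled scenarios. It is not the authors' implementation: the function names, the confidence level `alpha`, and the sample values are assumptions chosen for illustration, and VaR is taken here as the empirical quantile (the smallest sampled makespan that at least an `alpha` fraction of samples do not exceed).

```python
import math


def expected_makespan(makespans):
    """Sample mean of the makespans observed across scenarios."""
    return sum(makespans) / len(makespans)


def value_at_risk(makespans, alpha=0.95):
    """Empirical VaR at level alpha.

    Returns the smallest sampled makespan m such that at least an
    alpha fraction of the samples are <= m. This is one common
    empirical definition, assumed here for illustration.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(makespans)
    # 1-indexed rank of the quantile; the small epsilon guards against
    # floating-point error when alpha * n is an exact integer.
    rank = max(1, math.ceil(alpha * len(ordered) - 1e-9))
    return ordered[rank - 1]


# Hypothetical makespans of one fixed schedule under 10 sampled scenarios.
samples = [100, 104, 98, 120, 101, 99, 135, 102, 103, 97]

mean_value = expected_makespan(samples)
var_90 = value_at_risk(samples, alpha=0.9)
```

The two estimates can differ noticeably when a few scenarios produce long makespans: the mean averages the tail away, while VaR at a high `alpha` is determined by it.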
Abstract
Neural combinatorial optimization (NCO) has gained significant attention due to the potential of deep learning to efficiently solve combinatorial optimization problems. NCO has been widely applied to job shop scheduling problems (JSPs), with the current focus predominantly on deterministic problems. In this paper, we propose a novel attention-based scenario processing module (SPM) to extend NCO methods to stochastic JSPs. Our approach explicitly incorporates stochastic information through an attention mechanism that embeds sampled scenarios (i.e., an approximation of the stochasticity). The embedding is fed to the base neural network, which is thereby conditioned on the attended scenarios and learns an effective policy under stochasticity. We also propose a training paradigm that is compatible with either the expected makespan or the Value-at-Risk objective. Results demonstrate that our approach outperforms existing learning and non-learning methods for the flexible JSP with stochastic processing times on a variety of instances. In addition, our approach generalizes well to varied numbers of scenarios and to disparate distributions.
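The idea of attending over sampled scenarios can be sketched with generic scaled dot-product attention. This is only an illustration under stated assumptions, not the SPM architecture: the function `attend_over_scenarios`, the use of a deterministic-state vector as the query, and the absence of learned projections are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_scenarios(query, scenario_embeddings):
    """Pool scenario embeddings with single-head dot-product attention.

    query: list of d floats (assumed: an embedding of the deterministic state).
    scenario_embeddings: list of S vectors, each a list of d floats.
    Returns (pooled vector of length d, attention weights of length S).
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, emb)) / math.sqrt(d)
        for emb in scenario_embeddings
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * emb[i] for w, emb in zip(weights, scenario_embeddings))
        for i in range(d)
    ]
    return pooled, weights


# Hypothetical 4-dimensional embeddings for three sampled scenarios.
query = [1.0, 0.0, 0.5, -0.5]
scenarios = [
    [0.9, 0.1, 0.4, -0.4],
    [-1.0, 0.3, 0.0, 0.8],
    [0.2, 0.2, 0.2, 0.2],
]
pooled, weights = attend_over_scenarios(query, scenarios)
```

Because the pooled vector has a fixed length regardless of how many scenarios are supplied, a mechanism of this kind can accept a different number of scenarios at inference time than it saw during training.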
