Self-Labeling the Job Shop Scheduling Problem

Andrea Corsini, Angelo Porrello, Simone Calderara, Mauro Dell'Amico

TL;DR

This work shows that generative models can be trained by sampling multiple solutions and using the best one, according to the problem objective, as a pseudo-label. The model thereby iteratively improves its generation capability by relying only on its own self-supervision, eliminating the need for optimality information.

Abstract

This work proposes a self-supervised training strategy designed for combinatorial problems. An obstacle in applying supervised paradigms to such problems is the need for costly target solutions, often produced with exact solvers. Inspired by semi- and self-supervised learning, we show that generative models can be trained by sampling multiple solutions and using the best one according to the problem objective as a pseudo-label. In this way, we iteratively improve the model generation capability by relying only on its self-supervision, eliminating the need for optimality information. We validate this Self-Labeling Improvement Method (SLIM) on the Job Shop Scheduling Problem (JSP), a complex combinatorial problem that is receiving much attention from the neural combinatorial optimization community. We propose a generative model based on the well-known Pointer Network and train it with SLIM. Experiments on popular benchmarks demonstrate the potential of this approach, as the resulting models outperform constructive heuristics and state-of-the-art learning proposals for the JSP. Lastly, we show the robustness of SLIM to various parameters and its generality by applying it to the Traveling Salesman Problem.
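The self-labeling loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-job instance `JOBS`, the tabular `logits` (a stand-in for the Pointer Network), and the helper names `makespan`, `sample_solution`, and `self_labeling_step` are all assumptions made for illustration. The sketch samples `beta` solutions, keeps the one with the smallest makespan as a pseudo-label, and takes one cross-entropy gradient step toward it.

```python
import math
import random

# Hypothetical toy instance: each job is an ordered list of (machine, duration).
JOBS = [
    [(0, 3), (1, 2)],
    [(1, 2), (0, 4)],
]


def makespan(sequence, jobs=JOBS):
    """Schedule the next operation of each job in the order given by `sequence`."""
    next_op = [0] * len(jobs)
    job_ready = [0] * len(jobs)
    machine_ready = {}
    for j in sequence:
        machine, duration = jobs[j][next_op[j]]
        start = max(job_ready[j], machine_ready.get(machine, 0))
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
    return max(job_ready)


def sample_solution(logits, rng, jobs=JOBS):
    """Sample a sequence of job indices from a tabular policy logits[step][job]."""
    remaining = [len(ops) for ops in jobs]
    sequence = []
    for t in range(sum(remaining)):
        feasible = [j for j in range(len(jobs)) if remaining[j] > 0]
        weights = [math.exp(logits[t][j]) for j in feasible]
        j = rng.choices(feasible, weights=weights)[0]
        sequence.append(j)
        remaining[j] -= 1
    return sequence


def self_labeling_step(logits, beta, lr, rng, jobs=JOBS):
    """Sample beta solutions, use the best as pseudo-label, update the policy."""
    samples = [sample_solution(logits, rng, jobs) for _ in range(beta)]
    label = min(samples, key=lambda s: makespan(s, jobs))
    remaining = [len(ops) for ops in jobs]
    for t, target in enumerate(label):
        feasible = [j for j in range(len(jobs)) if remaining[j] > 0]
        z = sum(math.exp(logits[t][j]) for j in feasible)
        for j in feasible:
            p = math.exp(logits[t][j]) / z
            # Gradient of -log p(target) w.r.t. logit j is p - 1[j == target].
            logits[t][j] -= lr * (p - (1.0 if j == target else 0.0))
        remaining[target] -= 1
    return label, makespan(label, jobs)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [[0.0] * len(JOBS) for _ in range(4)]
    for _ in range(20):
        label, value = self_labeling_step(logits, beta=8, lr=0.5, rng=rng)
    print(label, value)
```

No optimal solution is ever supplied here: the only training signal is the best of the model's own samples, which is the property the abstract highlights. A real setup would replace the tabular logits with a neural policy conditioned on the instance and train over many instances.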

Paper Structure (25 sections, 3 equations, 7 figures, 8 tables)

Figures (7)

  • Figure 1: The sequences of decisions for constructing solutions in a JSP instance with two jobs ($J_1$ and $J_2$) and two machines (identified in green and red). Best viewed in color.
  • Figure 2: GM validation curves when trained with PPO and our SLIM in the same training setting of DispatchingJSP.
  • Figure 3: The average gaps when sampling $128$ solutions from architectures trained without and with self-labeling. CL$_{\text{UCL}}$ is the model obtained in CurriculumJSP by training with reward-to-go on random instance shapes (no curriculum learning) and CL is similarly obtained by applying curriculum learning.
  • Figure 3: The GM performance when trained by sampling a varying number of solutions $\beta$. For each shape, we report the average PG on instances of both benchmarks by sampling $512$ solutions during testing.
  • Figure 4: The GM performance (trained as described in the setup subsection) for varying numbers of sampled solutions $\beta$ at test time. For each shape, we report the average PG on instances of both benchmarks.
  • ...and 2 more figures