Graph Q-Learning for Combinatorial Optimization

Victoria M. Dax; Jiachen Li; Kevin Leahy; Mykel J. Kochenderfer

TL;DR

The paper addresses the intractably large solution spaces of combinatorial optimization (CO) by framing solution construction as a sequential decision process and training a heterogeneous graph neural network (GNN) to estimate $Q(s,a)$ for operation-to-machine assignments. Focusing on the Flexible Job Shop Scheduling Problem (FJSP), it presents preliminary evidence that a GNN trained with $Q$-learning can construct high-quality solutions whose performance approaches state-of-the-art heuristic-based solvers, using only a fraction of the parameters and training time. The results show competitive optimality gaps and favorable runtime scaling, with evidence of meta-learning capabilities that generalize to larger problem sizes. This approach suggests a scalable, generalizable path toward learned CO solvers that can complement or rival state-of-the-art heuristic methods in practice.
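
For orientation, the textbook $Q$-learning update behind an estimate such as $Q(s,a)$ is sketched below. The step size $\alpha$, discount factor $\gamma$, reward $r_t$, and the time indices are standard symbols assumed here for illustration; the paper's own loss and notation may differ.

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$

Under this view, a candidate solution is built greedily by choosing $a_t = \arg\max_{a} Q(s_t, a)$ at each step, where an action $a$ assigns an operation to a machine.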

Abstract

Graph-structured data is ubiquitous throughout natural and social sciences, and Graph Neural Networks (GNNs) have recently been shown to be effective at solving prediction and inference problems on graph data. In this paper, we propose and demonstrate that GNNs can be applied to solve Combinatorial Optimization (CO) problems. CO concerns optimizing a function over a discrete solution space that is often intractably large. To learn to solve CO problems, we formulate the optimization process as a sequential decision making problem, where the return is related to how close the candidate solution is to optimality. We use a GNN to learn a policy to iteratively build increasingly promising candidate solutions. We present preliminary evidence that GNNs trained through Q-Learning can solve CO problems with performance approaching state-of-the-art heuristic-based solvers, using only a fraction of the parameters and training time.
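
The sequential decision-making formulation described above can be illustrated with a minimal sketch. It is not the paper's implementation: a tabular $Q$-function stands in for the GNN, the two-job instance `JOBS` is hypothetical, and the reward (negative growth in makespan), the append-only scheduling rule, and all function names and hyperparameters are assumptions made for this example.

```python
import random
from collections import defaultdict

# Hypothetical toy FJSP instance: JOBS[j][k] maps each eligible machine to the
# processing time of the k-th operation of job j.
JOBS = [
    [{0: 3, 1: 5}, {1: 2}],
    [{0: 2}, {0: 4, 1: 3}],
]
N_MACHINES = 2


def initial_state():
    """State = (next operation per job, job-ready times, machine-ready times)."""
    zeros = tuple(0 for _ in JOBS)
    return (zeros, zeros, tuple(0 for _ in range(N_MACHINES)))


def actions(state):
    """All (job, machine) pairs that can legally be scheduled next."""
    next_op = state[0]
    return [(j, m)
            for j, k in enumerate(next_op) if k < len(JOBS[j])
            for m in sorted(JOBS[j][k])]


def step(state, action):
    """Append the job's next operation to the chosen machine's timeline."""
    next_op, job_ready, mach_ready = state
    j, m = action
    end = max(job_ready[j], mach_ready[m]) + JOBS[j][next_op[j]][m]
    new_next = next_op[:j] + (next_op[j] + 1,) + next_op[j + 1:]
    new_job = job_ready[:j] + (end,) + job_ready[j + 1:]
    new_mach = mach_ready[:m] + (end,) + mach_ready[m + 1:]
    # Assumed reward: negative growth in makespan, so the undiscounted return
    # of a finished schedule equals minus its makespan.
    reward = -(max(new_mach) - max(mach_ready))
    return (new_next, new_job, new_mach), reward


def train(episodes=2000, alpha=0.5, gamma=1.0, eps=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (stand-in for a GNN)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        s = initial_state()
        while actions(s):
            acts = actions(s)
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: q[(s, x)])
            s2, r = step(s, a)
            best_next = max((q[(s2, a2)] for a2 in actions(s2)), default=0.0)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q


def greedy_makespan(q):
    """Build one schedule by always taking the highest-valued assignment."""
    s = initial_state()
    while actions(s):
        s, _ = step(s, max(actions(s), key=lambda x: q[(s, x)]))
    return max(s[2])
```

Calling `greedy_makespan(train())` rolls out the learned policy on the toy instance. A table like this cannot transfer between instances, which is the gap a graph-based function approximator is meant to close.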

Paper Structure (12 sections, 3 equations, 4 figures, 1 table)

This paper contains 12 sections, 3 equations, 4 figures, 1 table.

Introduction
Literature Review
Preliminaries
Method
Problem Definition
Combinatorial Optimization as MDPs
Heterogeneous Graph Neural Networks
Experimental Results
Implementation details
Baselines
Results
Conclusion

Figures (4)

Figure 1: Training cycle.
Figure 2: Graphical representation of a sample FJSP instance at $t=0$ (a) and after 5 actions have been taken, $s_{t=5}$ (b).
Figure 3: Training performance summary ($25\times15$).
Figure 4: Runtimes per sample for different FJSP sizes.
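
As a rough illustration of the kind of graph Figure 2 depicts, the sketch below encodes an FJSP instance with two node types (operations and machines) and typed edges, then updates it after one assignment. The node and edge types, the class `FJSPGraph`, and the helpers `build_graph` and `assign` are assumptions made for this example and may not match the paper's actual graph construction.

```python
from dataclasses import dataclass, field


@dataclass
class FJSPGraph:
    """Hypothetical heterogeneous graph: operation and machine nodes, typed edges."""
    op_nodes: list        # (job, op_index) tuples
    machine_nodes: list   # machine ids
    precedence: list      # (op, next op) pairs within the same job
    eligible: dict        # (op, machine) -> processing time
    assigned: dict = field(default_factory=dict)  # op -> chosen machine


def build_graph(jobs, n_machines):
    """jobs[j][k] maps each eligible machine to the time of operation k of job j."""
    ops, prec, elig = [], [], {}
    for j, job in enumerate(jobs):
        for k, times in enumerate(job):
            ops.append((j, k))
            if k > 0:
                prec.append(((j, k - 1), (j, k)))
            for m, p in times.items():
                elig[((j, k), m)] = p
    return FJSPGraph(ops, list(range(n_machines)), prec, elig)


def assign(graph, op, machine):
    """Commit op to machine and drop its other operation-machine edges."""
    if (op, machine) not in graph.eligible:
        raise ValueError(f"machine {machine} cannot process operation {op}")
    for (o, m) in list(graph.eligible):
        if o == op and m != machine:
            del graph.eligible[(o, m)]
    graph.assigned[op] = machine
```

Each call to `assign` plays the role of one action, so a state such as $s_{t=5}$ would correspond to the graph after five such calls.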
