Graph Q-Learning for Combinatorial Optimization
Victoria M. Dax, Jiachen Li, Kevin Leahy, Mykel J. Kochenderfer
TL;DR
The paper addresses the intractability of combinatorial optimization (CO) by framing it as a sequential decision process and training a heterogeneous graph neural network (GNN) to estimate $Q(s,a)$ for operation-to-machine assignments. Focusing on the Flexible Job Shop Scheduling Problem (FJSP), it presents preliminary evidence that a GNN trained with Q-Learning can construct solutions whose quality approaches that of state-of-the-art heuristic-based solvers, using only a fraction of the parameters and training time. The reported results show competitive optimality gaps and favorable runtime scaling, with evidence of meta-learning capabilities that generalize to larger problem sizes. The approach suggests a scalable, generalizable path toward learned CO solvers that can complement state-of-the-art heuristic methods in practice.
Abstract
Graph-structured data is ubiquitous throughout the natural and social sciences, and Graph Neural Networks (GNNs) have recently been shown to be effective at solving prediction and inference problems on graph data. In this paper, we propose and demonstrate that GNNs can be applied to solve Combinatorial Optimization (CO) problems. CO concerns optimizing a function over a discrete solution space that is often intractably large. To learn to solve CO problems, we formulate the optimization process as a sequential decision-making problem, where the return is related to how close the candidate solution is to optimality. We use a GNN to learn a policy to iteratively build increasingly promising candidate solutions. We present preliminary evidence that GNNs trained through Q-Learning can solve CO problems with performance approaching state-of-the-art heuristic-based solvers, using only a fraction of the parameters and training time.
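To make the sequential formulation concrete, the following is a minimal sketch of Q-Learning on a toy operation-to-machine assignment problem. It is not the authors' method: a lookup table stands in for the GNN, precedence constraints are ignored, and the instance `DURATIONS`, the reward shaping (negative increase in makespan, so the undiscounted return equals the negative final makespan), and all function names and hyperparameters are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy instance (not from the paper): DURATIONS[i][m] is the
# processing time of operation i on machine m.
DURATIONS = [[3, 5], [2, 4], [6, 3], [4, 4]]
N_MACHINES = 2


def step(loads, op, machine):
    """Assign operation `op` to `machine`.

    Returns the new machine loads and a reward equal to the negative
    increase in makespan, so rewards sum to minus the final makespan.
    """
    new_loads = list(loads)
    new_loads[machine] += DURATIONS[op][machine]
    reward = -(max(new_loads) - max(loads))
    return tuple(new_loads), reward


def train(episodes=2000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-Learning; the table replaces the GNN used in the paper."""
    rng = random.Random(seed)
    q = defaultdict(float)  # keyed by (op index, machine loads, machine)
    for _ in range(episodes):
        loads = (0,) * N_MACHINES
        for op in range(len(DURATIONS)):
            # Epsilon-greedy choice of the machine for the current operation.
            if rng.random() < epsilon:
                a = rng.randrange(N_MACHINES)
            else:
                a = max(range(N_MACHINES), key=lambda m: q[(op, loads, m)])
            nxt, r = step(loads, op, a)
            terminal = op == len(DURATIONS) - 1
            best_next = 0.0 if terminal else max(
                q[(op + 1, nxt, m)] for m in range(N_MACHINES)
            )
            # Standard Q-Learning update toward the bootstrapped target.
            q[(op, loads, a)] += alpha * (r + gamma * best_next - q[(op, loads, a)])
            loads = nxt
    return q


def greedy_rollout(q):
    """Build a complete assignment by always taking the highest-valued action."""
    loads = (0,) * N_MACHINES
    assignment = []
    for op in range(len(DURATIONS)):
        a = max(range(N_MACHINES), key=lambda m: q[(op, loads, m)])
        assignment.append(a)
        loads, _ = step(loads, op, a)
    return assignment, max(loads)


if __name__ == "__main__":
    assignment, makespan = greedy_rollout(train())
    print("assignment:", assignment, "makespan:", makespan)
```

In this sketch the state is the pair (next operation, current machine loads), which only works because the toy instance is tiny. A learned graph encoder, as described in the abstract, is what would let the same idea scale to instances whose state space cannot be enumerated.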
