Learning Combinatorial Optimization Algorithms over Graphs
Hanjun Dai, Elias B. Khalil, Yuyu Zhang, Bistra Dilkina, Le Song
TL;DR
This paper automates the design of greedy heuristics for NP-hard graph optimization problems by learning a common Q-function formulation over a graph-structured state. It combines structure2vec graph embeddings with n-step Q-learning to produce a greedy meta-algorithm that applies to Minimum Vertex Cover (MVC), Maximum Cut (MAXCUT), and the Traveling Salesman Problem (TSP), and that generalizes to graphs larger than those used in training. Experiments show strong solution quality, favorable time–quality trade-offs, and the discovery of nontrivial heuristics, including strategies that balance node degree and connectivity. The framework points toward scalable, data-driven algorithm design for recurring graph-structured optimization problems.
Abstract
The design of good heuristics or approximation algorithms for NP-hard combinatorial optimization problems often requires significant specialized knowledge and trial-and-error. Can we automate this challenging, tedious process, and learn the algorithms instead? In many real-world applications, it is typically the case that the same optimization problem is solved again and again on a regular basis, maintaining the same problem structure but differing in the data. This provides an opportunity for learning heuristic algorithms that exploit the structure of such recurring problems. In this paper, we propose a unique combination of reinforcement learning and graph embedding to address this challenge. The learned greedy policy behaves like a meta-algorithm that incrementally constructs a solution, and the action is determined by the output of a graph embedding network capturing the current state of the solution. We show that our framework can be applied to a diverse range of optimization problems over graphs, and learns effective algorithms for the Minimum Vertex Cover, Maximum Cut and Traveling Salesman problems.
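To make the greedy meta-algorithm concrete, here is a minimal Python sketch of the construction loop for Minimum Vertex Cover. It is an illustration under stated assumptions, not the authors' implementation. The names `greedy_vertex_cover`, `build_adjacency`, and `uncovered_degree` are hypothetical. The scoring function is a hand-written placeholder (the number of still-uncovered edges at a node) standing in for the learned Q-function that the paper computes from a graph embedding network.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]
# A scoring function maps (adjacency, partial solution, candidate node) to a value.
# In the paper this role is played by a learned Q-function; here it is pluggable.
ScoreFn = Callable[[Dict[Node, Set[Node]], Set[Node], Node], float]


def build_adjacency(edges: List[Edge]) -> Dict[Node, Set[Node]]:
    """Build an undirected adjacency map from an edge list."""
    adj: Dict[Node, Set[Node]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def uncovered_degree(adj: Dict[Node, Set[Node]], selected: Set[Node], node: Node) -> float:
    """Placeholder score (an assumption, not the learned model):
    the number of edges at `node` whose other endpoint is not yet selected."""
    return float(sum(1 for nbr in adj[node] if nbr not in selected))


def greedy_vertex_cover(edges: List[Edge], score: ScoreFn = uncovered_degree) -> List[Node]:
    """Incrementally build a vertex cover, adding the highest-scoring node each step."""
    adj = build_adjacency(edges)
    selected: Set[Node] = set()
    order: List[Node] = []

    def has_uncovered_edge() -> bool:
        return any(u not in selected and v not in selected for u, v in edges)

    # The state is the partial solution; the action is the next node to add.
    while has_uncovered_edge():
        candidates = [n for n in adj if n not in selected]
        best = max(candidates, key=lambda n: score(adj, selected, n))
        selected.add(best)
        order.append(best)
    return order


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    print(greedy_vertex_cover(path))
```

Because the loop only depends on `score`, a trained model could be swapped in without changing the construction procedure, which is the sense in which the learned policy behaves like a meta-algorithm.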
