Preference-Based Gradient Estimation for ML-Guided Approximate Combinatorial Optimization
Arman Mielke, Uwe Bauknecht, Thilo Strauss, Mathias Niepert
TL;DR
This work improves combinatorial optimization (CO) solutions under a fixed time budget by combining fast, non-learned approximation algorithms with graph neural networks (GNNs) that predict the algorithms' input parameters. It introduces Preference-Based Gradient Estimation (PBGE), a self-supervised method for backpropagating through non-differentiable solvers: pairs of solutions are sampled from the solver and ranked by the CO objective, and the ranking supplies the training signal. The approach yields a provably effective gradient signal, scales to problems like minimum $k$-cut and TSP, and achieves strong performance on standard benchmarks without requiring ground-truth labels. In practice, PBGE lets data-driven guidance steer classic CO heuristics, improving solution quality while keeping runtimes fast and remaining broadly applicable to graph-based CO tasks.
Abstract
Combinatorial optimization (CO) problems arise across a broad spectrum of domains, including medicine, logistics, and manufacturing. While exact solutions are often computationally infeasible, many practical applications require high-quality solutions within a given time budget. To address this, we propose a learning-based approach that enhances existing non-learned approximation algorithms for CO. Specifically, we parameterize these approximation algorithms and train graph neural networks (GNNs) to predict parameter values that yield near-optimal solutions. Our method is trained end-to-end in a self-supervised fashion, using a novel gradient estimation scheme that treats the approximation algorithm as a black box. This approach combines the strengths of learning and traditional algorithms: the GNN learns from data to guide the algorithm toward better solutions, while the approximation algorithm ensures feasibility. We validate our method on two well-known combinatorial optimization problems: the travelling salesman problem (TSP) and the minimum k-cut problem. Our results demonstrate that the proposed approach is competitive with state-of-the-art learned CO solvers.
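The idea of learning solver parameters from pairwise comparisons of sampled solutions can be illustrated with a minimal Python sketch. This is not the authors' PBGE estimator. The noisy nearest-neighbour solver `sample_tour`, the +1/-1 edge update in `preference_gradient`, and the directly learned score table `scores` (a stand-in for the GNN's predicted parameters) are all assumptions made for illustration.

```python
import math
import random


def tour_length(tour, dist):
    """Length of the closed tour that visits the nodes in the given order."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def tour_edges(tour):
    """Undirected edges of a closed tour, as a set of (small, large) pairs."""
    n = len(tour)
    return {tuple(sorted((tour[i], tour[(i + 1) % n]))) for i in range(n)}


def sample_tour(dist, scores, rng, noise=0.5):
    """Hypothetical black-box solver: noisy nearest neighbour on
    score-adjusted distances. A higher score makes an edge more attractive."""
    tour, unvisited = [0], set(range(1, len(dist)))
    while unvisited:
        cur = tour[-1]
        nxt = min(
            sorted(unvisited),
            key=lambda j: dist[cur][j] - scores[cur][j] + rng.gauss(0.0, noise),
        )
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def preference_gradient(dist, scores, rng):
    """Sample two tours and rank them by the objective (tour length).
    Returns +1 on edges used only by the better tour and -1 on edges used
    only by the worse tour (an assumed, simplified update rule)."""
    n = len(dist)
    a = sample_tour(dist, scores, rng)
    b = sample_tour(dist, scores, rng)
    grad = [[0.0] * n for _ in range(n)]
    cost_a, cost_b = tour_length(a, dist), tour_length(b, dist)
    if cost_a == cost_b:
        return grad  # no preference between the two samples, so no signal
    better, worse = (a, b) if cost_a < cost_b else (b, a)
    e_better, e_worse = tour_edges(better), tour_edges(worse)
    for i, j in e_better - e_worse:
        grad[i][j] = grad[j][i] = 1.0
    for i, j in e_worse - e_better:
        grad[i][j] = grad[j][i] = -1.0
    return grad


def train(dist, steps=200, lr=0.05, seed=0):
    """Self-supervised loop: only solver samples and objective values are
    used, never ground-truth tours."""
    n = len(dist)
    rng = random.Random(seed)
    scores = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        grad = preference_gradient(dist, scores, rng)
        for i in range(n):
            for j in range(n):
                scores[i][j] += lr * grad[i][j]
    return scores


# Usage on a small, made-up Euclidean instance.
points = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 1.5)]
dist = [[math.dist(p, q) for q in points] for p in points]
scores = train(dist)
print(tour_length(sample_tour(dist, scores, random.Random(1), noise=0.0), dist))
```

The solver is only ever called, never differentiated, which matches the black-box setting described in the abstract. Because every tour is produced by the heuristic itself, feasibility is guaranteed regardless of the learned scores.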
