Differentiating Through Integer Linear Programs with Quadratic Regularization and Davis-Yin Splitting
Daniel McKenzie, Samy Wu Fung, Howard Heaton
TL;DR
This work addresses end-to-end learning for ILPs with context-dependent costs by relaxing the ILP to a quadratically regularized LP and solving it with a Davis–Yin three-operator splitting scheme. The resulting architecture, DYS-Net, has a forward pass that scales to large problem sizes and a backward pass that uses Jacobian-free backpropagation to yield informative gradients without requiring Lagrange multipliers. The authors provide theoretical conditions that ensure descent directions during training, and demonstrate that the combined forward/backward approach scales to tens of thousands of variables and outperforms existing baselines on shortest path and knapsack problems, including large-scale shortest-path settings. In practice, this enables efficient, differentiable optimization layers for complex combinatorial problems within neural networks; open-source code is provided to facilitate adoption and further research.
Abstract
In many applications, a combinatorial problem must be repeatedly solved with similar but distinct parameters. Yet the parameters $w$ are not directly observed; only contextual data $d$ that correlates with $w$ is available. It is tempting to use a neural network to predict $w$ given $d$. However, training such a model requires reconciling the discrete nature of combinatorial optimization with the gradient-based frameworks used to train neural networks. We study the case where the problem in question is an Integer Linear Program (ILP). We propose applying a three-operator splitting technique, also known as Davis-Yin splitting (DYS), to the quadratically regularized continuous relaxation of the ILP. We prove that the resulting scheme is compatible with the recently introduced Jacobian-free backpropagation (JFB). Our experiments on two representative ILPs, the shortest path problem and the knapsack problem, demonstrate that this combination (DYS on the forward pass, JFB on the backward pass) yields a scheme that scales more effectively to high-dimensional problems than existing schemes. All code associated with this paper is available at github.com/mines-opt-ml/fpo-dys.
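To make the forward pass concrete, the following is a minimal sketch of the standard Davis-Yin iteration applied to a quadratically regularized LP relaxation. It is not the authors' implementation: the toy problem (choose exactly one of $n$ items at minimum cost, i.e. minimize $w^\top x + \frac{\gamma}{2}\|x\|^2$ subject to $\sum_i x_i = 1$, $x \ge 0$), the function names, and the values of `gamma`, `alpha`, and `iters` are all assumptions chosen for illustration. The backward pass (JFB) is omitted, since it needs an automatic differentiation framework.

```python
# Sketch only: Davis-Yin splitting for
#     min_x  w.x + (gamma/2)*||x||^2   s.t.  sum(x) = total,  x >= 0.
# The toy constraint set and all names here are illustrative assumptions.

def project_nonneg(v):
    """Euclidean projection onto the nonnegative orthant {x : x >= 0}."""
    return [max(0.0, vi) for vi in v]


def project_sum_equals(v, total):
    """Euclidean projection onto the hyperplane {x : sum(x) = total}."""
    shift = (sum(v) - total) / len(v)
    return [vi - shift for vi in v]


def dys_solve(w, gamma=0.1, alpha=1.0, iters=500, total=1.0):
    """Run a fixed number of Davis-Yin iterations and return the relaxed solution.

    The gradient of the smooth term is w + gamma*x, which is gamma-Lipschitz,
    so the step size should satisfy 0 < alpha < 2/gamma.
    """
    z = [0.0] * len(w)
    for _ in range(iters):
        x = project_nonneg(z)                                # first projection
        grad = [wi + gamma * xi for wi, xi in zip(w, x)]     # gradient at x
        u = [2 * xi - zi - alpha * gi for xi, zi, gi in zip(x, z, grad)]
        y = project_sum_equals(u, total)                     # second projection
        z = [zi + yi - xi for zi, yi, xi in zip(z, y, x)]    # update the DYS variable
    return project_nonneg(z)


# Usage: the cheapest item is index 1, so the relaxed solution
# should be close to the one-hot vector [0, 1, 0].
costs = [3.0, 1.0, 2.0]
x_star = dys_solve(costs)
print(x_star)
```

With a small regularization weight, the relaxed solution here coincides with a vertex of the feasible set, which is why it recovers the integer answer. In a learned setting, `costs` would be the output of a neural network given the context $d$, and JFB would differentiate through only the final iteration rather than the whole loop.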
