Solver-Free Decision-Focused Learning for Linear Optimization Problems

Senne Berden, Ali İrfan Mahmutoğulları, Dimos Tsouros, Tias Guns

TL;DR

This paper tackles the computational bottleneck of decision-focused learning (DFL) in linear programs by proposing a solver-free training objective, lava, that avoids solving the predicted LP during training. Lava leverages LP geometry by precomputing the vertices adjacent to the ground-truth optimum and enforcing, through a convex, hinge-like loss, that the predicted cost vector ranks the optimum above its neighbors. The approach substantially reduces training time while maintaining competitive decision quality, especially on nondegenerate problems, and scales favorably with problem size. It also provides a practical framework for efficient DFL in real-world linear optimization settings, with a clear path to extending to ILPs/MILPs via LP relaxations.

Abstract

Mathematical optimization is a fundamental tool for decision-making in a wide range of applications. However, in many real-world scenarios, the parameters of the optimization problem are not known a priori and must be predicted from contextual features. This gives rise to predict-then-optimize problems, where a machine learning model predicts problem parameters that are then used to make decisions via optimization. A growing body of work on decision-focused learning (DFL) addresses this setting by training models specifically to produce predictions that maximize downstream decision quality, rather than predictive accuracy. While effective, DFL is computationally expensive, because it requires solving the optimization problem with the predicted parameters at each loss evaluation. In this work, we address this computational bottleneck for linear optimization problems, a common class of problems in both the DFL literature and real-world applications. We propose a solver-free training method that exploits the geometric structure of linear optimization to enable efficient training with minimal degradation in solution quality. Our method is based on the insight that a solution is optimal if and only if it achieves an objective value that is at least as good as that of its adjacent vertices on the feasible polytope. Building on this, our method compares the estimated quality of the ground-truth optimal solution with that of its precomputed adjacent vertices, and uses this comparison as the loss function. Experiments demonstrate that our method significantly reduces computational cost while maintaining high decision quality.

Paper Structure

This paper contains 31 sections, 2 theorems, 12 equations, 5 figures, 3 tables, 1 algorithm.

Key Result

Proposition 4.1

Let $z$ be a vertex of the feasible polytope, and let $Z_{adj}(z)$ be the set of vertices adjacent to $z$. Vertex $z$ is an optimal solution for cost vector $c$ if and only if $c^\top z \leq c^\top z_{adj}$ for all $z_{adj} \in Z_{adj}(z)$.
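The check in Proposition 4.1 can be illustrated with a minimal sketch. It assumes a minimization problem and that the adjacent vertices are already available as rows of an array; the function name `is_optimal_vertex`, the tolerance `tol`, and the unit-square example are illustrative choices, not taken from the paper.

```python
import numpy as np

def is_optimal_vertex(c, z, adjacent, tol=1e-9):
    """Check c^T z <= c^T z_adj for every adjacent vertex (minimization).

    `adjacent` holds one adjacent vertex per row. `tol` is a hypothetical
    numerical tolerance added for floating-point comparisons.
    """
    c = np.asarray(c, dtype=float)
    z = np.asarray(z, dtype=float)
    adjacent = np.asarray(adjacent, dtype=float)
    # adjacent @ c gives the objective value of each adjacent vertex.
    return bool(np.all(c @ z <= adjacent @ c + tol))

# Illustrative polytope: the unit square [0, 1]^2.
# Vertex (0, 0) is adjacent to (1, 0) and (0, 1).
z = [0.0, 0.0]
adj = [[1.0, 0.0], [0.0, 1.0]]

print(is_optimal_vertex([1.0, 2.0], z, adj))   # True: both neighbors cost more
print(is_optimal_vertex([-1.0, 2.0], z, adj))  # False: (1, 0) has lower cost
```

Only the adjacent vertices are inspected, so no LP needs to be solved to test optimality of a given vertex.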

Figures (5)

  • Figure 1: Optimal solution $z^\star$ has two adjacent vertices, $z_{adj}^{(1)}$ and $z_{adj}^{(2)}$. Geometrically, these vertices define the facets of the optimality cone of $z^\star$. All cost vectors $c$ for which $-c$ lies within this cone lead to $z^\star$. Cost vector $\hat{c}$ is predicted. $\hat{c}$ correctly prefers $z^\star$ to $z_{adj}^{(1)}$ (i.e., $\hat{c}^\top z^\star < \hat{c}^\top {\color{blue}z_{adj}^{(1)}}$), and thus lies on the correct side of the corresponding facet. This adds no penalty to $\mathcal{L}_{\text{AVA}}$. However, $\hat{c}$ wrongly prefers $z_{adj}^{(2)}$ to $z^\star$, and thus lies on the wrong side of the corresponding facet. This adds $\hat{c}^\top z^\star - \hat{c}^\top {\color{red} z_{adj}^{(2)}}$ to $\mathcal{L}_{\text{AVA}}$.
  • Figure 2: Comparison of the efficiency of different methods
  • Figure 3: Breakdown of lava's training time
  • Figure 4: Comparison of performance as a function of problem size for the multi-dimensional knapsack problem
  • Figure 5: Learning curves for lava on the random LP benchmark, comparing margin parameter values $\epsilon = 0$ and $\epsilon = 0.1$.
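
The penalty described in the Figure 1 caption can be sketched as a sum of hinge terms, one per adjacent vertex. This is only an illustration under stated assumptions: the exact definition of $\mathcal{L}_{\text{AVA}}$ is not given in this summary, so the summation over neighbors and the way the margin $\epsilon$ enters the hinge are assumed here, and the name `ava_loss` is hypothetical.

```python
import numpy as np

def ava_loss(c_hat, z_star, adjacent, eps=0.0):
    """Hinge-like penalty over adjacent vertices (assumed form, minimization).

    Each adjacent vertex contributes max(0, c_hat^T z_star - c_hat^T z_adj + eps).
    `adjacent` holds one precomputed adjacent vertex per row; `eps` is the margin.
    """
    c_hat = np.asarray(c_hat, dtype=float)
    z_star = np.asarray(z_star, dtype=float)
    adjacent = np.asarray(adjacent, dtype=float)
    # Positive gap: the predicted costs fail to prefer z_star by at least eps.
    gaps = c_hat @ z_star - adjacent @ c_hat + eps
    return float(np.sum(np.maximum(gaps, 0.0)))

# Illustrative setup: z_star = (0, 0) on the unit square, with two neighbors.
z_star = [0.0, 0.0]
adj = [[1.0, 0.0], [0.0, 1.0]]

# c_hat prefers z_star to (1, 0) but wrongly prefers (0, 1) to z_star.
c_hat = [1.0, -0.5]
print(ava_loss(c_hat, z_star, adj))  # 0.5, contributed by the second neighbor only
```

Because the adjacent vertices are precomputed from the ground-truth optimum, evaluating this quantity involves only dot products with the predicted cost vector and no solver call.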

Theorems & Definitions (9)

  • Definition 2.1: Basis
  • Definition 2.2: Basic solution
  • Definition 2.3: Basic feasible solution (BFS)
  • Definition 2.4: Degeneracy
  • Definition 2.5: Adjacent bases
  • Definition 2.6: Adjacent vertices
  • Proposition 4.1
  • Proposition A.1
  • proof