Generalization Bound and Learning Methods for Data-Driven Projections in Linear Programming

Shinsaku Sakaue; Taihei Oki

TL;DR

This work investigates data-driven projections to accelerate solving high-dimensional linear programs by learning a projection matrix $P\in\mathbb{R}^{n\times k}$ from past LP instances, reducing to a $k$-dimensional problem and recovering a feasible $n$-dimensional solution. It develops a theoretical generalization framework, establishing a $\tilde{O}(nk^2)$ upper bound and an $\Omega(nk)$ lower bound on the pseudo-dimension of the performance class, indicating near-tightness; it also introduces two practical learning methods, PCA-based and gradient-based (SGA), along with a final feasibility projection. The paper demonstrates that data-driven projections can yield significantly higher solution quality than random projections while achieving substantial speedups in solving LPs, validated on synthetic and real-world datasets. Overall, the results support a solver-agnostic, data-driven approach to LP dimensionality reduction with strong theoretical guarantees and practical impact for repeated/related LP instances.
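The project, solve, and recover pipeline described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes inequality-form LPs, $\max\, c^\top x$ subject to $Ax \le b$, uses SciPy's `linprog` instead of the Gurobi solver mentioned in the figure captions, and takes hand-picked projection matrices rather than learned ones. The function name `solve_projected_lp` and the toy instance are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def solve_projected_lp(c, A, b, P):
    """Solve max c^T x s.t. A x <= b restricted to x = P y.

    Returns the recovered n-dimensional point x = P y and its objective value.
    """
    c_proj = P.T @ c  # k-dimensional objective vector
    A_proj = A @ P    # m x k constraint matrix
    # linprog minimizes, so negate the objective; y is left unbounded (free).
    res = linprog(-c_proj, A_ub=A_proj, b_ub=b, bounds=(None, None), method="highs")
    if res.status != 0:
        raise RuntimeError(f"projected LP not solved: {res.message}")
    x = P @ res.x     # recover an n-dimensional solution
    return x, float(c @ x)


# Toy instance (hypothetical): maximize x1 + x2 + x3 + x4 over the box [0, 1]^4.
n = 4
c = np.ones(n)
A = np.vstack([np.eye(n), -np.eye(n)])
b = np.concatenate([np.ones(n), np.zeros(n)])

# Full LP, obtained by using the identity as the "projection".
x_full, obj_full = solve_projected_lp(c, A, b, np.eye(n))

# A k = 2 projection whose column space contains the optimum (1, 1, 1, 1).
P_good = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
x_good, obj_good = solve_projected_lp(c, A, b, P_good)

# A k = 1 projection that only moves along the first coordinate.
P_bad = np.array([[1.0], [0.0], [0.0], [0.0]])
x_bad, obj_bad = solve_projected_lp(c, A, b, P_bad)

# Objective ratio: projected objective divided by the full optimum.
print(obj_good / obj_full, obj_bad / obj_full)
```

Any $x = Py$ returned this way satisfies $Ax \le b$ because the projected constraints are exactly $A(Py) \le b$. The two hand-picked matrices show how much the recovered objective value depends on the choice of $P$, which is what motivates learning $P$ from data.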

Abstract

How to solve high-dimensional linear programs (LPs) efficiently is a fundamental question. Recently, there has been a surge of interest in reducing LP sizes using random projections, which can accelerate solving LPs independently of improving LP solvers. This paper explores a new direction of data-driven projections, which use projection matrices learned from data instead of random projection matrices. Given training data of $n$-dimensional LPs, we learn an $n\times k$ projection matrix with $n > k$. When addressing a future LP instance, we reduce its dimensionality from $n$ to $k$ via the learned projection matrix, solve the resulting LP to obtain a $k$-dimensional solution, and apply the learned matrix to it to recover an $n$-dimensional solution. On the theoretical side, a natural question is: how much data is sufficient to ensure the quality of recovered solutions? We address this question based on the framework of data-driven algorithm design, which connects the amount of data sufficient for establishing generalization bounds to the pseudo-dimension of performance metrics. We obtain an $\tilde{\mathrm{O}}(nk^2)$ upper bound on the pseudo-dimension, where $\tilde{\mathrm{O}}$ compresses logarithmic factors. We also provide an $\Omega(nk)$ lower bound, implying our result is tight up to an $\tilde{\mathrm{O}}(k)$ factor. On the practical side, we explore two simple methods for learning projection matrices: PCA- and gradient-based methods. While the former is relatively efficient, the latter can sometimes achieve better solution quality. Experiments demonstrate that learning projection matrices from data is indeed beneficial: it leads to significantly higher solution quality than the existing random projection while greatly reducing the time for solving LPs.
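The link between pseudo-dimension and the amount of data can be made concrete with the standard uniform-convergence bound for real-valued function classes. The statement below is a sketch under assumptions not spelled out in this summary: $u_P(\pi)$ denotes the objective value attained on LP instance $\pi$ when using projection matrix $P$, $\mathcal{U} = \{u_P\}$ is the resulting class, all values lie in $[0, H]$ for some assumed bound $H$, $\pi_1, \dots, \pi_N$ are i.i.d. draws from an unknown distribution $\mathcal{D}$, and $C$ is a universal constant.

$$
\Pr\left[\, \sup_{P \in \mathbb{R}^{n\times k}} \left| \frac{1}{N}\sum_{i=1}^{N} u_P(\pi_i) - \mathbb{E}_{\pi\sim\mathcal{D}}\left[u_P(\pi)\right] \right| \le \varepsilon \,\right] \ge 1 - \delta
\quad\text{whenever}\quad
N \ge C\,\frac{H^2}{\varepsilon^2}\left(\mathrm{pdim}(\mathcal{U}) + \log\frac{1}{\delta}\right).
$$

Substituting the upper bound $\mathrm{pdim}(\mathcal{U}) = \tilde{\mathrm{O}}(nk^2)$ suggests that $N = \tilde{\mathrm{O}}\!\left(\frac{H^2}{\varepsilon^2}\left(nk^2 + \log\frac{1}{\delta}\right)\right)$ training instances suffice for the empirical average objective value of every projection matrix to be within $\varepsilon$ of its expectation.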

Paper Structure (20 sections, 3 theorems, 13 equations, 3 figures, 1 table)

Introduction
Our contribution
Related work
Reducing dimensionality of LPs via projection
Data-driven projection
Generalization bound
Upper bound on $\mathrm{pdim}(\mathcal{U})$
Lower bound on $\mathrm{pdim}(\mathcal{U})$
Learning methods
PCA-based method
Gradient-based method
Final projection for feasibility
Experiments
Conclusion
Limitations and discussions
...and 5 more sections

Key Result

Lemma 4.3

Let $t \in \mathbb{R}$ be a threshold value. Consider an LP $\tilde{\pi} = (\tilde{\bm{c}}, \tilde{\bm{A}}, \tilde{\bm{b}}) \in \mathbb{R}^k\times \mathbb{R}^{m\times k} \times \mathbb{R}^m$ such that each entry of $\tilde{\bm{c}}$, $\tilde{\bm{A}}$, and $\tilde{\bm{b}}$ is a polynomial of degree at

Figures (3)

Figure 1: Plots of objective ratios (upper) and Gurobi's running times (lower, semi-log) for Full, ColRand, PCA, and SGA averaged over 100 test instances. The error band of ColRand indicates the standard deviation over $10$ independent trials. The results of Full are shown for every $k$ for reference, although it always solves $n$-dimensional LPs and hence is independent of $k$.
Figure 2: Running times of PCA and SGA for learning projection matrices on 200 training instances.
Figure 3: The same plots as in Figure 1 but with SGA initialized with ColRand instead of PCA, as mentioned in a footnote of the paper. Compared with Figure 1, the objective ratio of SGA deteriorates particularly in MinCostFlow, GROW7, SC205, and SCAGR25.

Theorems & Definitions (10)

Remark 2.1: Solver-specific aspects
Remark 3.2: Validity of the setting
Definition 4.1
Remark 4.2: Importance of uniform convergence
Lemma 4.3
proof
Theorem 4.4
proof
Theorem 4.5
Remark 5.1: Training time
