Data-driven Projection Generation for Efficiently Solving Heterogeneous Quadratic Programming Problems

Tomoharu Iwata, Futoshi Futami

TL;DR

This work presents a data-driven framework for efficiently solving large, heterogeneous QP problems by learning instance-specific projection matrices that reduce dimensionality from $N$ to $K$ via a graph neural network. The projected QP is solved with a solver, and the original solution is obtained by back-projecting, ensuring feasibility while aiming to minimize the original objective. Training uses a bilevel formulation with gradients computed through the envelope theorem, avoiding backpropagation through the inner solver, and the approach is backed by a generalization bound based on covering numbers and Lipschitz properties. Empirical results on Regression, Portfolio, and Control datasets show strong solution quality, feasibility, and substantial speedups over solving the full QP, with robust generalization across varying problem sizes and structures.
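The reduce-then-back-project idea can be illustrated with a minimal, dependency-free Python sketch. It assumes the QP has the form: minimize $\frac{1}{2}\mathbf{x}^\top\mathbf{Q}\mathbf{x} + \mathbf{c}^\top\mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \le \mathbf{b}$ (this excerpt does not state the exact form). The helper names (`project_qp`, `qp_objective`, `is_feasible`), the hand-picked matrix `P`, and the hand-picked point `y` are all hypothetical; in the paper, `P` would be generated by the graph neural network and `y` would be returned by a QP solver. The sketch only checks the two identities that make back-projection work: a point feasible for the projected QP maps to a point feasible for the original QP, and both have the same objective value.

```python
# Plain-Python linear algebra helpers (lists of lists), kept tiny on purpose.
def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def qp_objective(Q, c, x):
    """Value of 0.5 * x^T Q x + c^T x (assumed objective form)."""
    return 0.5 * dot(x, matvec(Q, x)) + dot(c, x)

def is_feasible(A, b, x, tol=1e-9):
    """True if A x <= b holds row by row (assumed constraint form)."""
    return all(ax <= bm + tol for ax, bm in zip(matvec(A, x), b))

def project_qp(Q, c, A, P):
    """Reduce an N-variable QP to K variables by substituting x = P y."""
    Pt = transpose(P)
    Q_red = matmul(matmul(Pt, Q), P)  # K x K
    c_red = matvec(Pt, c)             # length K
    A_red = matmul(A, P)              # M x K; b is unchanged
    return Q_red, c_red, A_red

# Toy instance: N = 3 variables, M = 2 constraints, K = 2 reduced variables.
Q = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
c = [-1.0, -1.0, -1.0]
A = [[1.0, 1.0, 1.0], [-1.0, 0.0, 0.0]]
b = [1.0, 0.0]
P = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hand-picked, not learned

Q_red, c_red, A_red = project_qp(Q, c, A, P)

y = [0.2, 0.1]    # stand-in for a solver's output on the projected QP
x = matvec(P, y)  # back-projection: x = P y

reduced_value = qp_objective(Q_red, c_red, y)
original_value = qp_objective(Q, c, x)
```

Because the constraint of the projected problem is exactly $\mathbf{A}\mathbf{P}\mathbf{y} \le \mathbf{b}$, any `y` accepted by `is_feasible(A_red, b, y)` yields an `x` accepted by `is_feasible(A, b, x)`, which is the feasibility property the summary refers to.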

Abstract

We propose a data-driven framework for efficiently solving quadratic programming (QP) problems by reducing the number of variables in high-dimensional QPs using instance-specific projection. A graph neural network-based model is designed to generate projections tailored to each QP instance, enabling us to produce high-quality solutions even for previously unseen problems. The model is trained on heterogeneous QPs to minimize the expected objective value evaluated on the projected solutions. This is formulated as a bilevel optimization problem; the inner optimization solves the QP under a given projection using a QP solver, while the outer optimization updates the model parameters. We develop an efficient algorithm to solve this bilevel optimization problem, which computes parameter gradients without backpropagating through the solver. We provide a theoretical analysis of the generalization ability of solving QPs with projection matrices generated by neural networks. Experimental results demonstrate that our method produces high-quality feasible solutions with reduced computation time, outperforming existing methods.
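To see why gradients can be obtained without backpropagating through the solver, the following worked derivation may help. It is a sketch under assumptions that this excerpt does not confirm: the QP is taken to be $\min_{\mathbf{x}} \frac{1}{2}\mathbf{x}^\top\mathbf{Q}\mathbf{x} + \mathbf{c}^\top\mathbf{x}$ subject to $\mathbf{A}\mathbf{x} \le \mathbf{b}$ with symmetric $\mathbf{Q}$, the projected problem is assumed to have a unique primal-dual solution $(\mathbf{y}^*, \bm{\lambda}^*)$, and the symbols $u$, $L$, and $\bm{\lambda}$ are introduced here for illustration. The exact expression used in the paper may differ.

$$
u(\mathbf{P}) = \min_{\mathbf{y} \in \mathbb{R}^{K}} \; \tfrac{1}{2}\mathbf{y}^\top \mathbf{P}^\top \mathbf{Q} \mathbf{P} \mathbf{y} + \mathbf{c}^\top \mathbf{P} \mathbf{y}
\quad \text{s.t.} \quad \mathbf{A}\mathbf{P}\mathbf{y} \le \mathbf{b}
$$

$$
L(\mathbf{y}, \bm{\lambda}; \mathbf{P}) = \tfrac{1}{2}\mathbf{y}^\top \mathbf{P}^\top \mathbf{Q} \mathbf{P} \mathbf{y} + \mathbf{c}^\top \mathbf{P} \mathbf{y} + \bm{\lambda}^\top (\mathbf{A}\mathbf{P}\mathbf{y} - \mathbf{b}), \qquad \bm{\lambda} \ge \mathbf{0}
$$

$$
\nabla_{\mathbf{P}}\, u(\mathbf{P}) = \nabla_{\mathbf{P}}\, L(\mathbf{y}^*, \bm{\lambda}^*; \mathbf{P})
= \left( \mathbf{Q}\mathbf{P}\mathbf{y}^* + \mathbf{c} + \mathbf{A}^\top \bm{\lambda}^* \right) \mathbf{y}^{*\top}
$$

Under these assumptions, the envelope theorem says that the derivative of the optimal value with respect to $\mathbf{P}$ is the partial derivative of the Lagrangian with $(\mathbf{y}^*, \bm{\lambda}^*)$ held fixed. The solver is therefore treated as a black box that returns $(\mathbf{y}^*, \bm{\lambda}^*)$, and the gradient with respect to the model parameters follows from the chain rule through the network that generates $\mathbf{P}$.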

Paper Structure

This paper contains 26 sections, 2 theorems, 43 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Theorem 4.2

Under Assumption asm_1, for any $\varepsilon>0$ and $\delta\in(0,1)$, with probability at least $1-\delta$ over the draws of $\{\bm{\pi}_{d}\}_{d=1}^{D}$, the generalization bound holds, where $C(\varPi)$ depends only on $(\sigma_{\mathrm{Q}},\sigma_{\mathrm{P}},Q_0,c_0,B)$; its explicit form appears in Appendix app_proof.

Figures (8)

  • Figure 1: Our framework. In the training phase, our model is trained using multiple QPs. In the test phase, we obtain a solution of test QPs using the trained model, where training and test QPs are different.
  • Figure 2: Our method to obtain a solution of a QP. 1) Given an original QP, an instance-specific projection matrix is generated using our model. Here, $N$ is the number of variables, $M$ is the number of constraints, and $K$ is the projection dimension. 2) The number of variables is reduced by the projection. 3) An optimal solution of the projected QP is obtained using a QP solver. 4) A solution of the original QP is recovered from the solution of the projected QP by $\tilde{\mathbf{x}} = \mathbf{P} \mathbf{y}^*$.
  • Figure 3: Graph for a QP. There are $N$ variable nodes and $M$ constraint nodes. Variable nodes are connected with weight $Q_{nn'}$. Variable and constraint nodes are connected with weight $A_{mn}$. Initial embeddings of variable and constraint nodes are calculated using QP parameters $c_{n}$ and $b_{m}$, respectively.
  • Figure 4: Average relative errors (top) and computation times in seconds for solving a QP (bottom) with varying numbers of reduced variables $K$. Bars show standard errors.
  • Figure 5: Average relative errors (top) and computation time in seconds for solving QPs (bottom) with varying numbers of variables $N$ in the test set. All data-driven methods were trained on QPs with $N = 500$, and projection-based methods used $K = 30$ reduced variables. Bars show the standard error.
  • ...and 3 more figures
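
The graph described in the Figure 3 caption can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the function name `build_qp_graph` and the returned dictionary layout are hypothetical, zero entries of $\mathbf{Q}$ and $\mathbf{A}$ are assumed to produce no edge, diagonal entries $Q_{nn}$ are kept as self-loops, and the raw scalars $c_n$ and $b_m$ stand in for the initial embeddings, whose actual computation is not given in this excerpt.

```python
def build_qp_graph(Q, c, A, b):
    """Build a weighted graph with N variable nodes and M constraint nodes.

    Q: N x N list of lists, c: length-N list,
    A: M x N list of lists, b: length-M list.
    """
    n_vars, n_cons = len(c), len(b)

    # Initial node features: one scalar per node (assumed stand-in for embeddings).
    var_features = [[float(cn)] for cn in c]
    con_features = [[float(bm)] for bm in b]

    # Variable-variable edges weighted by Q[n][n2]; zeros are skipped (assumption).
    var_var_edges = [
        (n, n2, Q[n][n2])
        for n in range(n_vars)
        for n2 in range(n_vars)
        if Q[n][n2] != 0
    ]

    # Constraint-variable edges weighted by A[m][n]; zeros are skipped (assumption).
    con_var_edges = [
        (m, n, A[m][n])
        for m in range(n_cons)
        for n in range(n_vars)
        if A[m][n] != 0
    ]

    return {
        "var_features": var_features,
        "con_features": con_features,
        "var_var_edges": var_var_edges,
        "con_var_edges": con_var_edges,
    }

# Toy instance with N = 3 variables and M = 2 constraints.
Q = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
c = [-1.0, 0.0, 3.0]
A = [[1.0, 1.0, 0.0], [0.0, -1.0, 1.0]]
b = [4.0, 5.0]

graph = build_qp_graph(Q, c, A, b)
```

Because the graph is built from whatever $N$ and $M$ a given instance has, the same construction applies to QPs of different sizes, which is consistent with the heterogeneous setting described in the captions above.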

Theorems & Definitions (3)

  • Theorem 4.2
  • Lemma B.1: Lipschitz continuity of $u$ w.r.t. $\mathbf{P}$
  • proof