A Penalty Approach for Differentiation Through Black-Box Quadratic Programming Solvers

Yuxuan Linghu; Zhiyuan Liu; Qi Deng

TL;DR

Differentiating through the solution of convex quadratic programs (QPs) is central to differentiable optimization, but KKT-based gradients are costly and can be numerically unstable at scale. The paper introduces dXPP, a penalty-based differentiation framework that decouples solving from differentiation: the forward pass remains solver-agnostic, while the backward pass differentiates through a smooth penalty reformulation, reducing gradient computation to solving a symmetric positive definite (SPD) linear system in the primal variables. A convergence guarantee shows that the penalty-based gradient converges to the true KKT gradient as the smoothing vanishes, and experiments demonstrate substantial speedups and robustness on large-scale sparse projection problems and a real-world multi-period portfolio optimization task. The approach enables efficient end-to-end learning with black-box QP solvers.

Abstract

Differentiating through the solution of a quadratic program (QP) is a central problem in differentiable optimization. Most existing approaches differentiate through the Karush--Kuhn--Tucker (KKT) system, but their computational cost and numerical robustness can degrade at scale. To address these limitations, we propose dXPP, a penalty-based differentiation framework that decouples QP solving from differentiation. In the solving step (forward pass), dXPP is solver-agnostic and can leverage any black-box QP solver. In the differentiation step (backward pass), we map the solution to a smooth approximate penalty problem and implicitly differentiate through it, requiring only the solution of a much smaller linear system in the primal variables. This approach bypasses the difficulties inherent in explicit KKT differentiation and significantly improves computational efficiency and robustness. We evaluate dXPP on various tasks, including randomly generated QPs, large-scale sparse projection problems, and a real-world multi-period portfolio optimization task. Empirical results demonstrate that dXPP is competitive with KKT-based differentiation methods and achieves substantial speedups on large-scale problems.
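The decoupling described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the paper's implementation: the toy problem (projection of `y` onto a half-space `a @ z <= b`), the closed-form `solve_projection` standing in for a black-box solver, the smoothed plus function `s(t) = (t + sqrt(t^2 + delta^2)) / 2`, the choice to evaluate the penalty Hessian directly at the solver's output, and the values of `rho` and `delta`. The point is only that the backward pass reduces to one SPD solve in the primal variables.

```python
import numpy as np

def solve_projection(y, a, b):
    """Stand-in for a black-box QP solver: project y onto {z : a @ z <= b}."""
    violation = max(0.0, a @ y - b)
    return y - violation / (a @ a) * a

def smooth_plus_hess(t, delta):
    """Second derivative of the assumed smoothing s(t) = (t + sqrt(t^2 + delta^2)) / 2."""
    return 0.5 * delta**2 / (t**2 + delta**2) ** 1.5

def penalty_backward(z_star, a, b, grad_z, rho=10.0, delta=1e-6):
    """Map dL/dz* to dL/dy by implicit differentiation of the smoothed penalty
    F(z) = 0.5 * ||z - y||^2 + rho * s(a @ z - b), evaluated at the solver output."""
    t = a @ z_star - b
    # Primal-dimensional SPD matrix: identity plus a rank-one penalty curvature term.
    H = np.eye(len(z_star)) + rho * smooth_plus_hess(t, delta) * np.outer(a, a)
    # Stationarity gives dz/dy = H^{-1}; H is symmetric, so dL/dy = H^{-1} dL/dz.
    return np.linalg.solve(H, grad_z)

y = np.array([2.0, 1.0])
a = np.array([1.0, 1.0])
b = 1.0

z_star = solve_projection(y, a, b)            # forward pass: any solver would do
grad_z = np.array([1.0, 0.0])                 # upstream loss gradient dL/dz*
grad_y = penalty_backward(z_star, a, b, grad_z)

# Reference: exact Jacobian of the projection when the constraint is active.
exact = grad_z - (a @ grad_z) / (a @ a) * a
```

In this toy case `grad_y` approaches `exact` as `delta` shrinks, which mirrors the consistency claim in the text; for a general QP the matrix `H` would instead collect the objective Hessian plus one curvature term per constraint.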

Paper Structure (30 sections, 4 theorems, 49 equations, 4 figures, 5 tables, 1 algorithm)

This paper contains 30 sections, 4 theorems, 49 equations, 4 figures, 5 tables, 1 algorithm.

Introduction
Related Work
Differentiable Optimization
Penalty Methods
Method
Differentiating Through the KKT Conditions
Penalty Reformulation and Implicit Differentiation
Plug-in Sensitivity and Consistency Analysis
Computational Advantages
Solver-agnostic forward pass
dXPP as an embedding layer
Reduction to primal-dimensional linear systems
Support for active-set pruning
Experiments
Gradient Accuracy
...and 15 more sections

Key Result

Proposition 1

Let $(z^\star, \nu^\star, \mu^\star)$ be an optimal primal--dual solution of the QP \eqref{eq:qp}. If the penalty weights satisfy ... then $z^\star$ is also a minimizer of the exact penalty problem \eqref{pb:penalty}.
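A small numerical check can make the exact-penalty idea concrete. The sketch below assumes the classical textbook condition, namely that the penalty weight is at least as large as the corresponding multiplier; the proposition's own inequality is not reproduced in this summary, so this should not be read as the paper's statement. The one-dimensional problem, the grid search, and the function name `exact_penalty_minimizer` are all hypothetical choices for illustration.

```python
import numpy as np

def exact_penalty_minimizer(y, h, rho):
    """Grid-search minimizer of F(z) = 0.5 * (z - y)^2 + rho * max(0, z - h)."""
    z = np.linspace(0.0, 3.0, 3001)
    F = 0.5 * (z - y) ** 2 + rho * np.maximum(0.0, z - h)
    return z[np.argmin(F)]

# Constrained problem: minimize 0.5 * (z - y)^2 subject to z <= h.
# With y = 2 and h = 1 the solution is z* = 1 and the multiplier is mu* = y - h = 1.
y, h = 2.0, 1.0

z_large = exact_penalty_minimizer(y, h, rho=2.0)  # rho above mu*: lands close to z* = 1
z_small = exact_penalty_minimizer(y, h, rho=0.5)  # rho below mu*: lands close to 1.5, infeasible
```

With the larger weight the penalty minimizer coincides with the constrained solution, while the smaller weight leaves a minimizer that violates `z <= h`, which is the behavior the classical condition rules out.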

Figures (4)

Figure 1: The learning workflow of dXPP: the previous layer outputs parameters $\theta$ and the QP layer returns the optimal solution $z^\star$ in the forward pass; the backward pass propagates loss gradients for end-to-end learning. By decoupling solving and differentiating through a penalty-based reformulation, dXPP enables efficient and scalable training with black-box QP solvers.
Figure 2: Backward runtime ratio ($t_{\texttt{dXPP}}^{\texttt{bwd}}/t_{\texttt{dQP}}^{\texttt{bwd}}$) for structured projection problems. The black dashed line represents the baseline (dQP's backward time normalized to 1). Exact results are reported in Table \ref{tab:scalability_random_projection} and Table \ref{tab:scalability_chain}.
Figure 3: Visualization of the runtime results in Table \ref{tab:portfolio_realworld_runtime}.
Figure 4: Sudoku experiments: training loss (left) and error rate (right) for dXPP, dQP, and OptNet.

Theorems & Definitions (9)

Proposition 1
Remark 1
Proposition 2
Proposition 3
Theorem 1
proof
proof
proof
proof
