Table of Contents
Fetching ...

Exact Verification of First-Order Methods via Mixed-Integer Linear Programming

Vinit Ranjan, Jisun Park, Stefano Gualandi, Andrea Lodi, Bartolomeo Stellato

TL;DR

This work develops an exact, scalable MILP framework to verify the finite-time performance of first-order methods on parametric convex QPs by maximizing the fixed-point residual after a fixed number of iterations. By encoding iterative updates as affine and piecewise-affine steps and constructing convex hulls for non-smooth operators, the approach can represent a wide class of gradient, projection, and proximal steps within a single MILP, while bound tightening and sequential solving enhance scalability. The authors demonstrate substantial accuracy gains over SDP-based PEP methods across sparse coding (Lasso), network flow, and optimal control problems, achieving several orders of magnitude reductions in worst-case residuals and enabling verification up to K in the tens of iterations. The results suggest a practical, exact verification tool for assessing real-time performance guarantees of first-order methods in parametric settings, with potential extensions to broader convex settings and warm-start configurations.

Abstract

We present exact mixed-integer linear programming formulations for verifying the performance of first-order methods for parametric quadratic optimization. We formulate the verification problem as a mixed-integer linear program where the objective is to maximize the infinity norm of the fixed-point residual after a given number of iterations. Our approach captures a wide range of gradient, projection, proximal iterations through affine or piecewise affine constraints. We derive tight polyhedral convex hull formulations of the constraints representing the algorithm iterations. To improve the scalability, we develop a custom bound tightening technique combining interval propagation, operator theory, and optimization-based bound tightening. Numerical examples, including linear and quadratic programs from network optimization, sparse coding using Lasso, and optimal control, show that our method provides several orders of magnitude reductions in the worst-case fixed-point residuals, closely matching the true worst-case performance.

Exact Verification of First-Order Methods via Mixed-Integer Linear Programming

TL;DR

This work develops an exact, scalable MILP framework to verify the finite-time performance of first-order methods on parametric convex QPs by maximizing the fixed-point residual after a fixed number of iterations. By encoding iterative updates as affine and piecewise-affine steps and constructing convex hulls for non-smooth operators, the approach can represent a wide class of gradient, projection, and proximal steps within a single MILP, while bound tightening and sequential solving enhance scalability. The authors demonstrate substantial accuracy gains over SDP-based PEP methods across sparse coding (Lasso), network flow, and optimal control problems, achieving several orders of magnitude reductions in worst-case residuals and enabling verification up to K in the tens of iterations. The results suggest a practical, exact verification tool for assessing real-time performance guarantees of first-order methods in parametric settings, with potential extensions to broader convex settings and warm-start configurations.

Abstract

We present exact mixed-integer linear programming formulations for verifying the performance of first-order methods for parametric quadratic optimization. We formulate the verification problem as a mixed-integer linear program where the objective is to maximize the infinity norm of the fixed-point residual after a given number of iterations. Our approach captures a wide range of gradient, projection, proximal iterations through affine or piecewise affine constraints. We derive tight polyhedral convex hull formulations of the constraints representing the algorithm iterations. To improve the scalability, we develop a custom bound tightening technique combining interval propagation, operator theory, and optimization-based bound tightening. Numerical examples, including linear and quadratic programs from network optimization, sparse coding using Lasso, and optimal control, show that our method provides several orders of magnitude reductions in the worst-case fixed-point residuals, closely matching the true worst-case performance.

Paper Structure

This paper contains 48 sections, 6 theorems, 57 equations, 11 figures.

Key Result

Theorem 2.1

Consider the region $\Phi = \{(y, w) \in [\text{\b{$y$}}, \bar{y}] \times {\hbox{\bf R}} \mid w = \mathcal{T}_\lambda(a^Ty)\}$. We can write its convex hull as where

Figures (11)

  • Figure 1: Left: Example of soft-thresholding $w = \mathcal{T}_\lambda(y_1 + y_2)$ with $\lambda = 0.5$ variable bounds $\text{\b{$y$}}_1 \leq y_1 \leq \bar{y}_1$ and $\text{\b{$y$}}_2 \leq y_2 \leq \bar{y}_2$; the two lines $y_1 + y_2 + \lambda = 0$ and $y_1 + y_2 - \lambda = 0$ delimit the area where $y=0$. Right: the convex hull of $w= \phi_\lambda(y_1 + y_2)$ given by Theorem \ref{['thm:softthreshold_convhull']}.
  • Figure 2: Left: Worst-case fixed-point residual norm for ISTA applied to \ref{['prob:lasso']} via VP and PEP. Right: Computation time to solve the respective optimization problems.
  • Figure 3: Worst-case fixed point residual norms for the ISTA (first column) and FISTA (second column) applied to \ref{['prob:lasso']}. In both the strongly convex case (first row, $p=20, n=15$) and the nonstrongly convex case (second row, $p=15, n=20$), the acceleration results in a reduction in the worst-case residual, with a larger gain in the nonstrongly convex case at $K=40$.
  • Figure 4: Solve times for the vanilla MILP defined in Section \ref{['sec:milp_formulation']} vs. the MILP with our scalability improvements in Section \ref{['sec:scalability']} on the same nonstrongly convex instance.
  • Figure 5: Worst-case fixed-point residual norms for the standard PDHG algorithm (left) and the version with momentum (right). Both algorithms show a rippling behavior and the momentum algorithm obtains a lower residual at $K=70$. The inset on the standard PDHG plot shows that SM is much closer to the VP bound than in the momentum plot.
  • ...and 6 more figures

Theorems & Definitions (13)

  • Theorem 2.1: Convex hull of soft-thresholding operator
  • proof
  • Proposition 2.1
  • proof
  • Remark 2.1
  • Theorem 2.2: Convex hull of rectified linear unit tjandraatmadjaConvexRelaxationBarrier2020
  • Theorem 2.3: Convex hull of saturated linear unit
  • proof
  • Remark 2.2
  • Proposition 3.1
  • ...and 3 more