Fast Computation of Superquantile-Constrained Optimization Through Implicit Scenario Reduction

Jake Roth, Ying Cui

TL;DR

The paper develops a fast, scalable solver for convex optimization with superquantile (CVaR) constraints by combining a double-loop augmented Lagrangian framework with a semismooth Newton inner solver. It exploits the tail structure of CVaR via an implicit scenario reduction, yielding a low-dimensional Newton system whose size is driven by the tail set rather than the total number of scenarios $m$. The method achieves substantial speedups over state-of-the-art solvers (e.g., Gurobi (GRB), OSQP, and Portfolio Safeguard (PSG)) on large-scale problems and enables efficient computation of CVaR-based solution paths, including real-data quantile regression with tens of millions of samples. The approach performs robustly when the number of scenarios far exceeds the number of decision variables ($m\gg n$), benefits in practice from warm-starting, reduces reliance on manual scenario pruning, and can extend to a nonlinear function $G$ with the same tail-sparsity benefits.

Abstract

Superquantiles have recently gained significant interest as a risk-aware metric for addressing fairness and distribution shifts in statistical learning and decision making problems. This paper introduces a fast, scalable and robust second-order computational framework to solve large-scale optimization problems with superquantile-based constraints. Unlike empirical risk minimization, superquantile-based optimization requires ranking random functions evaluated across all scenarios to compute the tail conditional expectation. While this tail-based feature might seem computationally unfriendly, it provides an advantageous setting for a semismooth-Newton-based augmented Lagrangian method. The superquantile operator effectively reduces the dimensions of the Newton systems since the tail expectation involves considerably fewer scenarios. Notably, the extra cost of obtaining relevant second-order information and performing matrix inversions is often comparable to, and sometimes even less than, the effort required for gradient computation. Our developed solver is particularly effective when the number of scenarios substantially exceeds the number of decision variables. In synthetic problems with linear and convex diagonal quadratic objectives, numerical experiments demonstrate that our method outperforms existing approaches by a large margin: It achieves speeds more than 750 times faster for linear and quadratic objectives than the alternating direction method of multipliers as implemented by OSQP for computing low-accuracy solutions. Additionally, it is up to 25 times faster for linear objectives and 70 times faster for quadratic objectives than the commercial solver Gurobi, and 20 times faster for linear objectives and 30 times faster for quadratic objectives than the Portfolio Safeguard optimization suite for high-accuracy solution computations.
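
To make the tail-based feature concrete, the following is a minimal sketch (not the authors' solver) of the empirical superquantile for $m$ equally weighted scenarios. It assumes that $k=(1-\tau)m$ is an integer, so that the superquantile at level $\tau$ is the average of the $k$ largest scenario values, and it checks this against the standard Rockafellar-Uryasev minimization form. The function names `superquantile`, `tail_indices`, and `superquantile_ru` and the sample data are hypothetical, chosen only for illustration.

```python
import heapq


def superquantile(values, k):
    """Average of the k largest entries (empirical CVaR under the assumption above)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return sum(heapq.nlargest(k, values)) / k


def tail_indices(values, k):
    """Indices of the k largest entries, i.e. the scenarios that enter the tail average."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return sorted(order[:k])


def superquantile_ru(values, k):
    """Rockafellar-Uryasev form: t + (1/k) * sum(max(v - t, 0)), evaluated at
    a minimizer t, here taken to be the k-th largest value."""
    t = heapq.nlargest(k, values)[-1]
    return t + sum(max(v - t, 0.0) for v in values) / k


# Hypothetical scenario losses (m = 8) with a tail of size k = 2.
losses = [0.5, 3.0, -1.0, 2.0, 4.0, 0.0, 1.5, -2.0]
k = 2

print(superquantile(losses, k))     # 3.5, the mean of the two largest losses
print(tail_indices(losses, k))      # [1, 4]
print(superquantile_ru(losses, k))  # 3.5, agrees with the top-k average
```

Only the scenarios returned by `tail_indices` contribute to the value, which is the intuition behind the reduced Newton systems described above: when $k\ll m$, most scenarios drop out of the tail average.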

Paper Structure

This paper contains 19 sections, 3 theorems, 64 equations, 5 figures, 3 tables, 2 algorithms.

Key Result

Lemma 1

Let $1\leq k\leq m$, $y=\reflectbox{\vec{\reflectbox{y}}}\in\mathbb{R}^m$ be a given sorted vector, and $\bar{y}$ be its projection with associated index-pair $(\bar{k}_0, \bar{k}_1)$ in eq:order_structure and index-sets $\{\bar{\alpha}, \bar{\beta}, \bar{\gamma}\}$ in eq:index. Let $(\bar{y}, \bar{

Figures (5)

  • Figure 1: (a) Schematic of the structured sparsity (with affine $G(x) \coloneqq Ax + b$ and a single superquantile constraint) for constructing the generalized Jacobian at a sorted point $\reflectbox{\vec{\reflectbox{y}}}^0\in\mathbb{R}^m$ based on (b) sorted projection $\bar{y}\in\mathcal{B}_k$ with index sets $\bar{\alpha},\bar{\beta},\bar{\gamma}$. Only the rows $A_{j,:}$ for $j\in \bar{\alpha}\cup \bar{\beta}$ are relevant in the semismooth Newton equation. See eq:varphi_hessian for the case of the nonlinear function $G$.
  • Figure 2: Heatmap of relative solve time $(\text{method } i - \text{ALM})/\text{ALM}$, for $i\in\{$a = GRB, b = G-OA, c = PSG, d = OSQP$\}$. Subfigures (i)--(iv) ((v)--(vii)) correspond to instances with linear (separable quadratic) objective. Red shading (positive) indicates that ALM was faster than method $i$; blue shading (negative) indicates that method $i$ was faster than ALM. Shading intensity represents relative solve time. Vertical (horizontal) hatching indicates that method $i$ (ALM) failed to obtain a satisfactory solution within 1 hr. For convenience: $2^{20}\approx 1.0\times10^6$, $2^{17}\approx 1.3\times10^5$, and $2^{13}\approx 0.8\times10^4$.

Theorems & Definitions (3)

  • Lemma 1
  • Lemma 2
  • Lemma 3