On $O(n)$ Algorithms for Projection onto the Top-$k$-sum Sublevel Set

Jake Roth, Ying Cui

TL;DR

This work addresses the problem of fast, exact Euclidean projection onto the top-$k$-sum sublevel set, a subroutine central to composite superquantile optimization. It introduces two finite-termination solvers that operate in $O(n)$ time for sorted inputs: a parametric-LCP (PLCP) approach leveraging a $Z$-matrix structure and a refined early-stopping grid-search (ESGS) that exploits KKT structure. Unsorted inputs incur an additional $O(n\log n)$ cost from sorting, but partial sorting and online permutation strategies can substantially reduce practical effort in iterative settings. Numerical experiments demonstrate dramatic speedups over grid-search, semismooth Newton, and generic QP solvers, enabling scale to $n=10^7$ with $k=10^4$ in fractions of a second. The methods extend to related vector-$k$-norm projections and offer a practical oracle for large-scale risk-averse optimization problems, with potential applications to sequential projections in iterative methods and to the vector-$k$-norm ball.

Abstract

The \emph{top-$k$-sum} operator computes the sum of the largest $k$ components of a given vector. The Euclidean projection onto the top-$k$-sum sublevel set serves as a crucial subroutine in iterative methods to solve composite superquantile optimization problems. In this paper, we introduce a solver that implements two finite-termination algorithms to compute this projection. Both algorithms have $O(n)$ complexity of floating point operations when applied to a sorted $n$-dimensional input vector, where the absorbed constant is \emph{independent of $k$}. This stands in contrast to an existing grid-search-inspired method that has $O(k(n-k))$ complexity, a partition-based method with $O(n+D\log D)$ complexity, where $D\leq n$ is the number of distinct elements in the input vector, and a semismooth Newton method with a finite termination property but unspecified floating point complexity. The improvement of our methods over the first method is significant when $k$ is linearly dependent on $n$, which is frequently encountered in practical superquantile optimization applications. In instances where the input vector is unsorted, an additional cost is incurred to (partially) sort the vector, whereas a full sort of the input vector seems unavoidable for the other two methods. To reduce this cost, we further derive a rigorous procedure that leverages approximate sorting to compute the projection, which is particularly useful when solving a sequence of similar projection problems. Numerical results show that our methods solve problems of scale $n=10^7$ and $k=10^4$ within $0.05$ seconds, whereas the most competitive alternative, the semismooth Newton-based method, takes about $1$ second. The existing grid-search method and Gurobi's QP solver can take from minutes to hours.
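To make the objects in the abstract concrete, the following is a minimal sketch of the top-$k$-sum operator and a membership test for its sublevel set $\{x : \text{top-}k\text{-sum}(x)\leq r\}$. It is not the paper's solver and does not compute the projection. The function names `top_k_sum` and `in_sublevel_set`, the budget parameter `r`, and the tolerance `tol` are illustrative assumptions. The sketch uses NumPy's `np.partition` for partial selection, so no full sort is needed to evaluate the operator.

```python
import numpy as np


def top_k_sum(x, k):
    """Sum of the k largest components of x (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # np.partition moves the k largest entries into the last k slots
    # (in arbitrary order) without fully sorting the array.
    return float(np.partition(x, n - k)[n - k:].sum())


def in_sublevel_set(x, k, r, tol=1e-12):
    """Check whether the top-k-sum of x is at most r (up to an assumed tolerance)."""
    return top_k_sum(x, k) <= r + tol


x0 = [3.0, -1.0, 5.0, 2.0]
print(top_k_sum(x0, 2))             # sum of the two largest entries, 5 and 3
print(in_sublevel_set(x0, 2, 7.5))  # x0 violates the budget r = 7.5
```

A point for which `in_sublevel_set` returns `False` is one that a projection routine would have to move; a point already in the set is its own projection.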

Paper Structure

This paper contains 19 sections, 6 theorems, 40 equations, 5 figures, 1 table, 3 algorithms.

Key Result

Proposition 1

The overall complexity to solve the sorted problem (eq:maxksum_projection_sort) by the PLCP algorithm (alg:projection_maxksum_lcp) is $O(n)$.

Figures (5)

  • Figure 1: Schematic of sorted input $\reflectbox{\vec{\reflectbox{x}}}^0\in\mathbb{R}^n$ (grey, top) and sorted projection $\bar{x}\in\mathcal{B}^{r}_{(k)}$ (blue, bottom).
  • Figure 2: Schematic for the ESGS algorithm (alg:projection_maxksum_snake). Orange shading indicates that $\mathtt{kkt}_2\land\mathtt{kkt}_3$ holds; blue shading indicates that $\neg\mathtt{kkt}_2\land\mathtt{kkt}_3$ holds; red shading indicates that $\mathtt{kkt}_4\land\mathtt{kkt}_5$ holds; black circles trace the trajectory taken by ESGS and indicate that $\mathtt{kkt}_1\land\mathtt{kkt}_3\land\mathtt{kkt}_4$ holds.
  • Figure 3: Linking conditions.
  • Figure 4: Average total computation time excluding sort time and full sort time vs $n$. All results are averaged over 100 instances, except for methods GRID, GRBS, and GRBU with $n\in\{10^6,10^7\}$ in which a time-limit of $10^4$ seconds is imposed across 2 instances.
  • Figure 5: Computation time relative to ESGS averaged over 100 instances. A value of $c>0$ indicates that ESGS was $c$ times faster than the other method; a value of $c<0$ indicates that the other method was $|c|$ times faster than ESGS. Across all scenarios at $n=10^5$, the better of GRBS and GRBU was at best $\approx350$ times slower than ESGS (and never faster).

Theorems & Definitions (23)

  • proof
  • proof
  • proof
  • Proposition 1
  • proof
  • Lemma 1
  • proof
  • Lemma 2: Early stop
  • proof
  • Lemma 3: Late start
  • ...and 13 more