On $O(n)$ Algorithms for Projection onto the Top-$k$-sum Sublevel Set
Jake Roth, Ying Cui
TL;DR
This work addresses fast, exact Euclidean projection onto the top-$k$-sum sublevel set, a subroutine central to composite superquantile optimization. It introduces two finite-termination solvers that run in $O(n)$ time on sorted inputs: a parametric-LCP (PLCP) approach that leverages a $Z$-matrix structure, and a refined early-stopping grid-search (ESGS) that exploits the KKT structure. Unsorted inputs incur an additional $O(n\log n)$ sorting cost, but partial sorting and online permutation strategies can substantially reduce the practical effort in iterative settings. Numerical experiments show large speedups over grid-search, semismooth Newton, and generic QP solvers, with problems of size $n=10^7$ and $k=10^4$ solved in a fraction of a second. The methods extend to projection onto the vector-$k$-norm ball and offer a practical oracle for large-scale risk-averse optimization, including sequences of similar projections that arise in iterative methods.
Abstract
The \emph{top-$k$-sum} operator computes the sum of the largest $k$ components of a given vector. The Euclidean projection onto the top-$k$-sum sublevel set serves as a crucial subroutine in iterative methods for solving composite superquantile optimization problems. In this paper, we introduce a solver that implements two finite-termination algorithms to compute this projection. Both algorithms require $O(n)$ floating-point operations when applied to a sorted $n$-dimensional input vector, where the absorbed constant is \emph{independent of $k$}. This stands in contrast to an existing grid-search-inspired method that has $O(k(n-k))$ complexity, a partition-based method with $O(n+D\log D)$ complexity, where $D\leq n$ is the number of distinct elements in the input vector, and a semismooth Newton method with a finite termination property but unspecified floating-point complexity. The improvement of our methods over the first method is significant when $k$ grows linearly with $n$, which is frequently encountered in practical superquantile optimization applications. In instances where the input vector is unsorted, an additional cost is incurred to (partially) sort the vector, whereas a full sort of the input vector seems unavoidable for the other two methods. To reduce this cost, we further derive a rigorous procedure that leverages approximate sorting to compute the projection, which is particularly useful when solving a sequence of similar projection problems. Numerical results show that our methods solve problems of scale $n=10^7$ and $k=10^4$ within $0.05$ seconds, whereas the most competitive alternative, the semismooth Newton-based method, takes about $1$ second. The existing grid-search method and Gurobi's QP solver can take from minutes to hours.
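
To make the central object concrete, the following minimal Python sketch evaluates the top-$k$-sum operator and tests membership in its sublevel set. It is an illustration only, not the paper's solver and not a projection routine: the function names `top_k_sum` and `in_sublevel_set`, the budget symbol `r`, and the tolerance `tol` are assumptions introduced here.

```python
import numpy as np

def top_k_sum(x, k):
    """Sum of the k largest components of x (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    # np.partition puts the k largest entries (in arbitrary order) into the
    # last k positions, so a full sort is not needed.
    return float(np.partition(x, n - k)[n - k:].sum())

def in_sublevel_set(x, k, r, tol=1e-12):
    """Check whether the top-k-sum of x is at most r (r and tol are assumed names)."""
    return top_k_sum(x, k) <= r + tol

x = [3.0, 1.0, 4.0, 1.0, 5.0]
print(top_k_sum(x, 2))             # the two largest entries are 5 and 4
print(in_sublevel_set(x, 2, 9.0))  # feasible when the budget is at least their sum
```

The sketch uses a partial selection rather than a full sort, which is enough to evaluate the operator; computing the projection itself requires the algorithms described in the paper.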
