Accelerated Frank-Wolfe Algorithms: Complementarity Conditions and Sparsity
Dan Garber
TL;DR
The paper studies how to accelerate Frank-Wolfe (FW) methods for smooth convex optimization over two constraint families: polytopes, and matrix domains such as the spectrahedron and the unit nuclear-norm ball. It introduces two accelerated FW schemes built around AFISTA: a method for polytopes that relies purely on a linear optimization oracle (LOO), and a hybrid FW method with sparse projections for matrix domains. Both attain the optimal $O(1/\sqrt{\epsilon})$ first-order (FO) oracle complexity, and after a finite burn-in phase their LOO or sparse-projection complexity also scales with $1/\sqrt{\epsilon}$. Central to the analysis are complementarity conditions that bound the active dimension (face dimension or rank) of optimal solutions, which makes the post-burn-in complexity independent of the ambient dimension. Numerical experiments on a convex quadratic over the unit simplex corroborate the theory, showing substantial gains in outer-iteration efficiency and reduced reliance on full-dimensional projections in sparse regimes.
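To make the experimental setting concrete, the following is a minimal sketch of the classical (non-accelerated) Frank-Wolfe method on a convex quadratic over the unit simplex. It is not the paper's accelerated scheme. The function names `simplex_loo` and `frank_wolfe_simplex`, the quadratic form $f(x) = \tfrac{1}{2}x^\top Q x + b^\top x$, and the step size $2/(t+2)$ are assumptions chosen for illustration.

```python
import numpy as np


def simplex_loo(grad):
    """Linear optimization oracle over the unit simplex.

    A linear function is minimized over the simplex at a vertex, namely the
    standard basis vector indexed by the smallest gradient coordinate.
    """
    v = np.zeros_like(grad)
    v[np.argmin(grad)] = 1.0
    return v


def frank_wolfe_simplex(Q, b, num_iters=200):
    """Vanilla Frank-Wolfe for f(x) = 0.5 * x^T Q x + b^T x over the simplex.

    Q is assumed symmetric positive semidefinite (hypothetical test problem).
    Returns the final iterate and the Frank-Wolfe (dual) gap at that iterate.
    """
    n = b.shape[0]
    x = np.zeros(n)
    x[0] = 1.0  # start at a vertex, so early iterates are sparse
    for t in range(num_iters):
        grad = Q @ x + b
        v = simplex_loo(grad)
        gamma = 2.0 / (t + 2.0)  # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * v
    grad = Q @ x + b
    gap = grad @ (x - simplex_loo(grad))  # upper bound on f(x) - f(x*)
    return x, gap


if __name__ == "__main__":
    Q = np.eye(4)
    b = np.zeros(4)
    x, gap = frank_wolfe_simplex(Q, b)
    print("iterate:", x)
    print("FW gap:", gap)
```

Each iteration uses one gradient evaluation and one LOO call and never projects onto the simplex. This vanilla method needs on the order of $1/\epsilon$ iterations, which is the baseline that the accelerated $1/\sqrt{\epsilon}$ rates improve on.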
Abstract
We develop new accelerated first-order algorithms in the Frank-Wolfe (FW) family for minimizing smooth convex functions over compact convex sets, with a focus on two prominent constraint classes: (1) polytopes and (2) matrix domains given by the spectrahedron and the unit nuclear-norm ball. A key technical ingredient is a complementarity condition that captures solution sparsity -- face dimension for polytopes and rank for matrices. We present two algorithms: (1) a purely linear optimization oracle (LOO) method for polytopes that has optimal worst-case first-order (FO) oracle complexity and, aside from a finite \emph{burn-in} phase and up to a logarithmic factor, has LOO complexity that scales with $r/\sqrt{\epsilon}$, where $\epsilon$ is the target accuracy and $r$ is the solution sparsity (independently of the ambient dimension), and (2) a hybrid scheme that combines FW with a sparse projection oracle (e.g., low-rank SVDs for matrix domains with low-rank solutions), which also has optimal FO oracle complexity and, after a finite burn-in phase, requires only $O(1/\sqrt{\epsilon})$ sparse projections and LOO calls (independently of both the ambient dimension and the rank of optimal solutions). Our results close a gap on how to accelerate recent advancements in linearly-converging FW algorithms for strongly convex optimization, without paying the price of the dimension.
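As an illustration of what an LOO call looks like on a matrix domain, here is a minimal sketch of the linear optimization oracle over the unit nuclear-norm ball. It uses the standard fact that $\min_{\|V\|_* \le 1} \langle G, V \rangle = -\sigma_1(G)$, attained at $V = -u_1 v_1^\top$ for a leading singular vector pair $(u_1, v_1)$ of $G$. The function name `nuclear_ball_loo` is hypothetical, and a full SVD is used only to keep the sketch short; it is assumed that a practical implementation would compute just the leading pair with an iterative method.

```python
import numpy as np


def nuclear_ball_loo(grad):
    """Linear optimization oracle over the unit nuclear-norm ball.

    Returns the rank-one matrix -u1 v1^T, where (u1, v1) is a leading
    singular vector pair of `grad`. A full SVD is used here for brevity
    (assumption: real code would compute only the top pair).
    """
    U, _, Vt = np.linalg.svd(grad, full_matrices=False)
    return -np.outer(U[:, 0], Vt[0, :])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((5, 3))
    V = nuclear_ball_loo(G)
    print("inner product <G, V>:", np.sum(G * V))
    print("largest singular value of G:", np.linalg.svd(G, compute_uv=False)[0])
```

The oracle's output always has rank one, whereas a Euclidean projection onto the same ball generally needs a full SVD. The hybrid scheme's sparse projection oracle sits between the two, using low-rank SVDs when the solution is low-rank.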
