A Linearly Convergent Frank-Wolfe-type Method for Smooth Convex Minimization over the Spectrahedron

Dan Garber

TL;DR

This work tackles smooth convex minimization over the spectrahedron $\mathcal{S}^n$ by introducing a Frank-Wolfe-type method that relies solely on rank-one updates. Under quadratic growth and strict complementarity, the algorithm achieves linear convergence after a finite burn-in that is independent of the ambient dimension, while maintaining $O(n^2)$ time per iteration and requiring only the smoothness parameter $\beta$. The method combines standard FW steps with specialized away/drop and randomized pairwise updates to adapt the active face and ensure contraction, achieving linear rates for both rank-one and higher-rank optima. Empirical results on synthetic matrix sensing corroborate the theory, showing linear convergence where standard FW is sublinear, and highlighting robustness to violations of strict complementarity.

Abstract

We consider the problem of minimizing a smooth and convex function over the $n$-dimensional spectrahedron -- the set of real symmetric $n\times n$ positive semidefinite matrices with unit trace, which underlies numerous applications in statistics, machine learning, and other domains. Standard first-order methods often require high-rank matrix computations, which are prohibitive when the dimension $n$ is large. The well-known Frank-Wolfe method, on the other hand, requires only efficient rank-one matrix computations, but suffers from slow worst-case convergence, even under conditions that enable linear convergence rates for standard methods. In this work we present the first Frank-Wolfe-based algorithm that applies only efficient rank-one matrix computations and, assuming quadratic growth and strict complementarity conditions, is guaranteed, after a finite number of iterations, to converge linearly, in expectation, and independently of the ambient dimension.
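
For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of the standard Frank-Wolfe method over the spectrahedron, not the paper's algorithm: it has no away, drop, or pairwise steps. The function name `fw_spectrahedron`, the short-step rule, and the toy quadratic objective are assumptions chosen for the example.

```python
import numpy as np

def fw_spectrahedron(grad, X0, beta, num_iters=200, tol=1e-12):
    """Standard Frank-Wolfe over {X symmetric PSD, trace(X) = 1} (illustrative)."""
    X = X0.copy()
    for _ in range(num_iters):
        G = grad(X)
        # Linear minimization oracle: <V, G> over the spectrahedron is minimized
        # at v v^T, where v is a unit eigenvector for the smallest eigenvalue
        # of G. np.linalg.eigh returns eigenvalues in ascending order.
        _, vecs = np.linalg.eigh(G)
        v = vecs[:, 0]
        D = np.outer(v, v) - X          # Frank-Wolfe direction
        gap = -np.sum(G * D)            # Frank-Wolfe duality gap, nonnegative
        if gap <= tol:
            break
        # Short-step rule: uses only the smoothness parameter beta.
        eta = min(1.0, gap / (beta * np.sum(D * D)))
        X = X + eta * D                 # rank-one update, convex combination
    return X

# Toy problem (assumed for illustration): f(X) = 0.5 * ||X - M||_F^2 with a
# rank-two target M inside the spectrahedron, so beta = 1.
rng = np.random.default_rng(0)
n = 20
Q, _ = np.linalg.qr(rng.standard_normal((n, 2)))
M = 0.7 * np.outer(Q[:, 0], Q[:, 0]) + 0.3 * np.outer(Q[:, 1], Q[:, 1])

def f(X):
    return 0.5 * np.sum((X - M) ** 2)

def grad(X):
    return X - M

X0 = np.eye(n) / n                      # feasible starting point
X = fw_spectrahedron(grad, X0, beta=1.0)
print("f(X0) =", f(X0), " f(X) =", f(X), " trace(X) =", np.trace(X))
```

The full eigendecomposition is used here only for brevity. At large $n$ one would compute just the extreme eigenvector with an iterative solver such as `scipy.sparse.linalg.eigsh`, which is what makes each step a cheap rank-one computation.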

Paper Structure

This paper contains 16 sections, 11 theorems, 53 equations, 4 figures, 2 tables, 1 algorithm.

Key Result

Lemma 1

The sequence $({\mathbf{X}}_t)_{t\geq 1}$ produced by Algorithm 1 is always feasible w.r.t. $\mathcal{S}^n$.
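
As a small worked illustration of why feasibility is preserved, consider only a standard Frank-Wolfe step, assuming a step size $\eta \in [0,1]$ and a unit vector ${\mathbf{v}}$ (both symbols are introduced here for the example). The away, drop, and pairwise steps of the algorithm require the paper's own argument and are not covered by this sketch.

$$
{\mathbf{X}}_{t+1} = (1-\eta)\,{\mathbf{X}}_t + \eta\,{\mathbf{v}}{\mathbf{v}}^\top,
\qquad
\mathrm{Tr}({\mathbf{X}}_{t+1}) = (1-\eta)\,\mathrm{Tr}({\mathbf{X}}_t) + \eta\,\|{\mathbf{v}}\|_2^2 = (1-\eta) + \eta = 1 .
$$

Since ${\mathbf{X}}_t \succeq 0$, ${\mathbf{v}}{\mathbf{v}}^\top \succeq 0$, and both weights are nonnegative, ${\mathbf{X}}_{t+1}$ is symmetric positive semidefinite with unit trace, so ${\mathbf{X}}_{t+1} \in \mathcal{S}^n$ whenever ${\mathbf{X}}_t \in \mathcal{S}^n$.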

Figures (4)

  • Figure 1: Comparison of Frank-Wolfe with line-search and our Algorithm 1. We set $n=100$ and $m=15nr^*$.
  • Figure 2: Comparison of different variants of our Algorithm 1. We set $n=100$, $r^*=5$, and $m=15nr^*$.
  • Figure 3: Comparison of the Block-FW method and our Algorithm 1 for $r^* = 10$. We set $n=150$ and $m=20nr^*$.
  • Figure 4: Comparison of Frank-Wolfe with line-search and our Algorithm 1 in the case where strict complementarity does not hold. We set $n=100$, $r^*=3$, and $m=10nr^*$.

Theorems & Definitions (19)

  • Lemma 1: feasibility of Algorithm 1
  • Lemma 2
  • Theorem 1
  • Theorem 2: convergence of Algorithm 1
  • Lemma 3: drop step
  • proof
  • proof
  • Lemma 4: pairwise step
  • proof
  • Lemma 5
  • ...and 9 more