A New Convergence Analysis of Two Stochastic Frank-Wolfe Algorithms

Natthawut Boonsiriphatthanajaroen, Shane G. Henderson

TL;DR

This work analyzes stochastic linearly constrained optimization using standard and away-step Frank-Wolfe methods without assuming a finite-sum objective. By combining a Lyapunov function framework with polytope geometry, it derives iteration and sample-size complexity under two gradient-noise regimes: bounded variance and sub-Gaussian tails. The results show that away-step FW achieves significantly faster iteration rates than standard FW, with iteration complexity of $\mathcal{O}(\varepsilon^{-1})$ versus $\mathcal{O}(\varepsilon^{-2})$, and that gradient-sample requirements scale as $\mathcal{O}(\varepsilon^{-2}\log(1/\varepsilon))$ or better under sub-Gaussian noise. The analysis also proves an exponential tail for the stopping time and provides explicit constants linking geometry (pyramidal width) and problem parameters to convergence, informing practical sampling decisions in simulation optimization scenarios.

Abstract

We study the convergence properties of the original and away-step Frank-Wolfe algorithms for linearly constrained stochastic optimization, assuming the availability of unbiased estimates of the objective function gradient. The objective function is not restricted to a finite-sum form, as it is in previous analyses tailored to machine-learning applications. To enable the use of concentration inequalities, we assume either a uniform bound on the variance of the gradient estimates or uniformly sub-Gaussian tails on the gradient estimates. Under either of these regularity assumptions, sufficient sampling ensures sufficiently accurate gradient estimates. We then use a Lyapunov argument to obtain the desired complexity bounds, relying on existing geometrical results for polytopes.
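To make the setting concrete, the following is a minimal sketch of a standard (non-away-step) stochastic Frank-Wolfe loop, not the paper's exact algorithm. Everything specific here is an illustrative assumption: the toy objective $f(x) = \tfrac{1}{2}\|x - b\|^2$, the probability simplex as the polytope, Gaussian gradient noise, the classical step size $2/(k+2)$, and a batch size that grows as $(k+1)^2$ so that the averaged gradient estimate becomes more accurate over time.

```python
import numpy as np

def lmo_simplex(g):
    """Linear minimization oracle over the probability simplex:
    returns the vertex v minimizing <g, v>."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

def grad_estimate(x, b, batch, rng, sigma=1.0):
    """Unbiased estimate of grad f(x) = x - b, obtained by averaging
    `batch` noisy samples (hypothetical Gaussian noise model)."""
    noise = rng.normal(0.0, sigma, size=(batch, x.size))
    return (x - b) + noise.mean(axis=0)

def stochastic_fw(b, iters=100, seed=0):
    """Standard Frank-Wolfe with sampled gradients (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = np.full(b.size, 1.0 / b.size)  # start at the simplex barycenter
    for k in range(iters):
        # Assumed schedule: more samples per iteration as k grows.
        g = grad_estimate(x, b, batch=(k + 1) ** 2, rng=rng)
        v = lmo_simplex(g)
        gamma = 2.0 / (k + 2)  # classical step size, an assumption here
        x = (1.0 - gamma) * x + gamma * v  # convex combination stays feasible
    return x

b = np.array([0.6, 0.3, 0.1])  # minimizer of the toy objective, inside the simplex
x_hat = stochastic_fw(b)
```

Because each update is a convex combination of the current iterate and a vertex, every iterate remains in the polytope without any projection step, which is the main practical appeal of Frank-Wolfe methods for linearly constrained problems.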

Paper Structure

This paper contains 11 sections, 13 theorems, 87 equations, 1 table, 2 algorithms.

Key Result

Lemma 4.1

If we have a good gradient approximation at iteration $k$, then where $\beta_1 = \min\{\frac{\varepsilon}{8LD^2}, \frac{1}{4}\}$.
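As a quick numerical illustration of the constant in Lemma 4.1, the sketch below evaluates $\beta_1$ for a few inputs. The interpretation of the symbols is an assumption, since this summary does not define them: $L$ is taken to be a gradient Lipschitz (smoothness) constant, $D$ the diameter of the feasible polytope, and $\varepsilon$ the target accuracy. The helper name `beta_1` is hypothetical.

```python
def beta_1(eps, L, D):
    """beta_1 = min{ eps / (8 L D^2), 1/4 } as stated in Lemma 4.1.
    Symbol meanings (L: smoothness constant, D: diameter) are assumed."""
    return min(eps / (8.0 * L * D ** 2), 0.25)

# Small target accuracy: the first term is the active one.
small_eps = beta_1(eps=0.01, L=1.0, D=2.0)   # 0.01 / 32

# Large target accuracy: the cap of 1/4 takes over.
large_eps = beta_1(eps=100.0, L=1.0, D=1.0)  # min(12.5, 0.25)
```

For small $\varepsilon$ the constant scales linearly in $\varepsilon$ and inversely in $LD^2$, so it shrinks as the target accuracy tightens or as the feasible region grows.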

Theorems & Definitions (31)

  • Definition 4.1
  • Lemma 4.1
  • proof
  • Definition 4.2
  • Lemma 4.2
  • proof
  • Lemma 4.3
  • proof
  • Definition 4.3
  • Definition 4.4
  • ...and 21 more