Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon, Sébastien Designolle, Sebastian Pokutta

TL;DR

The paper addresses projection-free optimization for constrained convex problems by augmenting Frank-Wolfe methods with corrective steps on the active set. It introduces Corrective Frank-Wolfe (CFW) and two quadratic corrections (QC-LP and QC-MNP) that operate over the affine hull of the active set and converge in finite time under suitable conditions. The authors also analyze split conditional gradient (SCG) and second-order conditional gradient sliding (SOCGS) when combined with quadratic corrections, deriving improved rates, face-identification properties, and convergence guarantees for generalized self-concordant objectives. Empirically, the framework yields substantial speedups on regression, entanglement detection, Birkhoff projections, and tensor completion.

Abstract

We develop a Frank-Wolfe algorithm with corrective steps, generalizing previous algorithms including blended conditional gradients, blended pairwise conditional gradients, and fully-corrective Frank-Wolfe. For this, we prove tight convergence guarantees together with an optimal face identification property. Furthermore, we propose two highly efficient corrective steps for convex quadratic objectives based on linear optimization or linear system solving, akin to Wolfe's minimum-norm point, and show that they converge in finite time under suitable conditions. Beyond optimization problems that are directly quadratic, we revisit two algorithms - split conditional gradient and second-order conditional gradient sliding - which can leverage quadratic corrections to accelerate their quadratic subproblems. We demonstrate improved convergence rates for the first and broader applicability for the second, which may be of independent interest. Finally, we show substantial computational speedups for Frank-Wolfe-based algorithms with quadratic corrections across the considered problem classes.
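
The abstract describes corrective steps for quadratic objectives that solve a linear system, akin to Wolfe's minimum-norm point. The following is a minimal sketch of that general idea and not the paper's QC-LP or QC-MNP: it assumes the objective $f(x) = \tfrac{1}{2}\|x - b\|^2$ over the probability simplex, takes an ordinary Frank-Wolfe step with exact line search, and then moves toward the minimizer over the affine hull of the active set, using a ratio test to keep the weights nonnegative. The function names `affine_hull_minimizer`, `prune`, and `project_onto_simplex_fw`, as well as the tolerances, are hypothetical choices made for illustration.

```python
import numpy as np


def affine_hull_minimizer(atoms, b):
    """Weights minimizing 0.5*||atoms.T @ lam - b||^2 subject to sum(lam) == 1.

    No sign constraint is imposed, so this is a plain KKT linear system.
    """
    A = np.asarray(atoms, dtype=float)      # one atom per row, shape (k, n)
    k = A.shape[0]
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = A @ A.T                   # Gram matrix of the atoms
    kkt[:k, k] = 1.0                        # multiplier column
    kkt[k, :k] = 1.0                        # constraint sum(lam) == 1
    rhs = np.append(A @ b, 1.0)
    # lstsq tolerates a singular KKT matrix (affinely dependent atoms).
    return np.linalg.lstsq(kkt, rhs, rcond=None)[0][:k]


def prune(active, lam, eps=1e-12):
    """Drop atoms with (numerically) zero weight and renormalize."""
    keep = lam > eps
    active = [a for a, flag in zip(active, keep) if flag]
    lam = lam[keep]
    return active, lam / lam.sum()


def project_onto_simplex_fw(b, max_iter=1000, tol=1e-10):
    """Frank-Wolfe with an affine-hull correction for min 0.5*||x - b||^2."""
    b = np.asarray(b, dtype=float)
    eye = np.eye(b.size)                    # simplex vertices are unit vectors
    active, lam = [0], np.array([1.0])
    x = eye[0].copy()
    for _ in range(max_iter):
        grad = x - b
        i = int(np.argmin(grad))            # linear minimization oracle
        gap = grad @ x - grad[i]            # Frank-Wolfe gap
        if gap <= tol:
            break
        # Standard Frank-Wolfe step with exact line search for the quadratic.
        d = eye[i] - x
        gamma = min(1.0, gap / (d @ d))
        lam = (1.0 - gamma) * lam
        if i in active:
            lam[active.index(i)] += gamma
        else:
            active.append(i)
            lam = np.append(lam, gamma)
        active, lam = prune(active, lam)
        # Correction: move toward the affine-hull minimizer of the active set.
        target = affine_hull_minimizer(eye[active], b)
        if target.min() < 0.0:
            # Ratio test: largest step that keeps every weight nonnegative.
            neg = target < 0.0
            step = np.min(lam[neg] / (lam[neg] - target[neg]))
            lam = lam + step * (target - lam)
        else:
            lam = target
        active, lam = prune(active, lam)
        x = lam @ eye[active]
    return x


x = project_onto_simplex_fw(np.array([0.5, 0.3, -0.2, 0.9]))
```

Taking the Frank-Wolfe step before the correction gives every active weight a strictly positive value, so the ratio test always allows a positive step. Because the objective is convex along the segment toward the affine-hull minimizer, the correction in this sketch never increases the objective.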

Paper Structure

This paper contains 39 sections, 28 theorems, 140 equations, 11 figures, 2 tables, and 11 algorithms.

Key Result

Proposition 0

The local pairwise step (alg:local_pairwise_step), the fully-corrective Frank-Wolfe step (alg:fcfw), and the simplex gradient descent step from pok18bcg satisfy the criteria of a corrective step (alg:corrective_step).

Figures (11)

  • Figure 1: Sparse regression over the $K$-Sparse polytope for $K \in \{5,20\}$
  • Figure 2: Entanglement detection for $a \in \{0.25, 0.5\}$
  • Figure 3: Projection onto the intersection of the Birkhoff polytope and a shifted $\ell_2$ ball for dimension $n \in \{300, 500\}$
  • Figure 4: Logistic regression over the $\ell_1$-ball for maximum number of inner steps $k \in \{50,200\}$
  • Figure 5: Sparse regression with $K \in \{3, 30\}$ and $\tau=1$
  • ...and 6 more figures

Theorems & Definitions (42)

  • Proposition 0
  • Remark 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Remark 2
  • Proposition 3
  • Proposition 3
  • Proposition 3
  • Remark 3
  • ...and 32 more