Convergence of $\text{log}(1/ε)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

Ioannis Anagnostides; Tuomas Sandholm

Convergence of $\text{log}(1/ε)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

Ioannis Anagnostides, Tuomas Sandholm

TL;DR

It is shown that several gradient-based algorithms in the celebrated framework of smoothed analysis have polynomial smoothed complexity, in that their number of iterations grows as a polynomial in the dimensions of the game, $\textsf{log}(1/\epsilon)$ and $1/\sigma$, where $\sigma$ measures the magnitude of the smoothing perturbation.

Abstract

Gradient-based algorithms have shown great promise in solving large (two-player) zero-sum games. However, their success has been mostly confined to the low-precision regime since the number of iterations grows polynomially in $1/ε$, where $ε> 0$ is the duality gap. While it has been well-documented that linear convergence -- an iteration complexity scaling as $\textsf{log}(1/ε)$ -- can be attained even with gradient-based algorithms, that comes at the cost of introducing a dependency on certain condition number-like quantities which can be exponentially large in the description of the game. To address this shortcoming, we examine the iteration complexity of several gradient-based algorithms in the celebrated framework of smoothed analysis, and we show that they have polynomial smoothed complexity, in that their number of iterations grows as a polynomial in the dimensions of the game, $\textsf{log}(1/ε)$, and $1/σ$, where $σ$ measures the magnitude of the smoothing perturbation. Our result applies to optimistic gradient and extra-gradient descent/ascent, as well as a certain iterative variant of Nesterov's smoothing technique. From a technical standpoint, the proof proceeds by characterizing and performing a smoothed analysis of a certain error bound, the key ingredient driving linear convergence in zero-sum games. En route, our characterization also makes a natural connection between the convergence rate of such algorithms and perturbation-stability properties of the equilibrium, which is of interest beyond the model of smoothed complexity.

Convergence of $\text{log}(1/ε)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

TL;DR

and

, where

measures the magnitude of the smoothing perturbation.

Abstract

, where

is the duality gap. While it has been well-documented that linear convergence -- an iteration complexity scaling as

-- can be attained even with gradient-based algorithms, that comes at the cost of introducing a dependency on certain condition number-like quantities which can be exponentially large in the description of the game. To address this shortcoming, we examine the iteration complexity of several gradient-based algorithms in the celebrated framework of smoothed analysis, and we show that they have polynomial smoothed complexity, in that their number of iterations grows as a polynomial in the dimensions of the game,

, and

, where

measures the magnitude of the smoothing perturbation. Our result applies to optimistic gradient and extra-gradient descent/ascent, as well as a certain iterative variant of Nesterov's smoothing technique. From a technical standpoint, the proof proceeds by characterizing and performing a smoothed analysis of a certain error bound, the key ingredient driving linear convergence in zero-sum games. En route, our characterization also makes a natural connection between the convergence rate of such algorithms and perturbation-stability properties of the equilibrium, which is of interest beyond the model of smoothed complexity.

Convergence of $\text{log}(1/ε)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

TL;DR

Abstract

Convergence of $\text{log}(1/ε)$ for Gradient-Based Algorithms in Zero-Sum Games without the Condition Number: A Smoothed Analysis

TL;DR

Abstract

Paper Structure

Table of Contents

Key Result

Theorems & Definitions (48)