When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa

TL;DR

This work analyzes Momentum Extragradient (MEG) for differentiable games through a polynomial-based lens that ties convergence to the Jacobian spectrum of the game dynamics. By expressing MEG residuals with Chebyshev polynomials and introducing a link function σ(λ), the authors classify spectra into three robust modes and derive exact optimal hyperparameters (h, γ, m) for each. The resulting asymptotic rates show accelerated convergence in all three cases, with Case 1 exhibiting a super-accelerated behavior and Cases 2–3 achieving rates close to or beyond established lower bounds for complex-spectrum problems. Local convergence for non-affine vector fields is established via momentum restarting, and numerical experiments on quadratic games corroborate the theory, highlighting MEG’s superiority over GD, EG, and GDM under the studied spectral conditions.

Abstract

The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extragradient method achieves convergence. Building on the recently proven accelerated convergence of the momentum extragradient method for bilinear games \citep{azizian2020accelerating}, we use a polynomial-based analysis to identify three distinct scenarios where this method exhibits further accelerated convergence. These scenarios encompass situations where the eigenvalues reside on the (positive) real line, lie on the real line alongside complex conjugates, or exist solely as complex conjugates. Furthermore, we derive the hyperparameters for each scenario that achieve the fastest convergence rate.
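The claim that the simple gradient method can diverge on a bilinear game while extragradient converges can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the scalar bilinear game $\min_x \max_y xy$, whose vector field is $v(w) = Aw$ with a skew-symmetric $A$ (purely imaginary eigenvalues $\pm i$), and an arbitrarily chosen step size `h = 0.5`. The function names are hypothetical.

```python
import numpy as np

# Assumed toy problem: min_x max_y x*y, with w = (x, y).
# Vector field v(w) = (dL/dx, -dL/dy) = (y, -x) = A w; eigenvalues of A are +/- i.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])


def v(w):
    """Game vector field of the assumed bilinear game."""
    return A @ w


def gradient_method(w0, h, steps):
    """Simultaneous gradient method: w <- w - h * v(w)."""
    w = w0.copy()
    for _ in range(steps):
        w = w - h * v(w)
    return w


def extragradient(w0, h, steps):
    """Extragradient: extrapolate first, then update using the extrapolated field."""
    w = w0.copy()
    for _ in range(steps):
        w_half = w - h * v(w)   # extrapolation step
        w = w - h * v(w_half)   # update step
    return w


w0 = np.array([1.0, 1.0])
h, steps = 0.5, 100  # step size chosen for illustration, not tuned

gd_norm = np.linalg.norm(gradient_method(w0, h, steps))
eg_norm = np.linalg.norm(extragradient(w0, h, steps))
print(f"gradient method: ||w_t|| = {gd_norm:.3e}")
print(f"extragradient:   ||w_t|| = {eg_norm:.3e}")
```

On this toy problem each gradient step scales the distance to the equilibrium $w^\star = 0$ by $\sqrt{1 + h^2} > 1$, whereas each extragradient step scales it by $\sqrt{1 - h^2 + h^4} < 1$ for $0 < h < 1$, so the first norm grows and the second shrinks.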

Paper Structure

This paper contains 33 sections, 16 theorems, 114 equations, 3 figures, 1 algorithm.

Key Result

Lemma 1

Let $w_t$ be the iterate generated by a first-order method after $t$ iterations, with $v(w) = Aw + b$. Then, there exists a real polynomial $P_t$, of degree at most $t$, satisfying $w_t - w^\star = P_t(A)(w_0 - w^\star)$, where $P_t(0) = 1$ and $w^\star$ is a stationary point, i.e., $v(w^\star) = Aw^\star + b = 0$.
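As a sanity check of the residual-polynomial statement, the sketch below verifies it for the simplest first-order method, the plain gradient method $w_{t+1} = w_t - h\,v(w_t)$, for which the polynomial is $P_t(\lambda) = (1 - h\lambda)^t$, so that $w_t - w^\star = P_t(A)(w_0 - w^\star)$ and $P_t(0) = 1$. The matrix `A`, vector `b`, step size `h`, and starting point are hypothetical values picked for illustration; they do not come from the paper.

```python
import numpy as np

# Hypothetical affine vector field v(w) = A w + b (illustrative values only).
A = np.array([[2.0, 1.0], [-1.0, 3.0]])
b = np.array([1.0, -2.0])
h, t = 0.1, 10


def v(w):
    """Affine vector field from the lemma's assumption."""
    return A @ w + b


# Stationary point: A w* + b = 0.
w_star = np.linalg.solve(A, -b)

# Run t iterations of the plain gradient method.
w0 = np.array([5.0, -4.0])
w = w0.copy()
for _ in range(t):
    w = w - h * v(w)

# Residual polynomial of the gradient method evaluated at A: P_t(A) = (I - h A)^t.
P_t_of_A = np.linalg.matrix_power(np.eye(2) - h * A, t)

lhs = w - w_star               # w_t - w*
rhs = P_t_of_A @ (w0 - w_star)  # P_t(A) (w_0 - w*)
P_t_at_zero = (1.0 - h * 0.0) ** t  # normalization P_t(0) = 1

print("identity holds:", np.allclose(lhs, rhs))
print("P_t(0) =", P_t_at_zero)
```

The identity follows because $b = -Aw^\star$, so each step gives $w_{t+1} - w^\star = (I - hA)(w_t - w^\star)$. Methods with extrapolation or momentum yield different polynomials, but the same normalization $P_t(0) = 1$.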

Figures (3)

  • Figure 1: Convergence rates of MEG in terms of the game Jacobian eigenvalues. The step sizes for MEG, $h$ and $\gamma$, and the momentum parameter $m$ are set according to each case of Theorem \ref{thm:three-modes}, illustrating three distinct convergence modes of MEG. For each case, the red line indicates the robust region (cf. Definition \ref{def:robust_region}) where MEG achieves the optimal convergence rate.
  • Figure 2: Illustration of the three spectrum models where MEG achieves accelerated convergence rates.
  • Figure 3: Illustration of the game Jacobian spectra and the performance of the different algorithms considered. The Jacobian spectrum in the first plot matches $\mathcal{S}_2^\star$ in \ref{eq:eigen-model} exactly, while the spectrum in the third plot follows $\mathcal{S}_2^\star$ only approximately. The second (fourth) plot shows the performance of the algorithms on the quadratic games in \ref{eq:quad-loss-1} whose Jacobian spectrum is shown in the first (third) plot.

Theorems & Definitions (33)

  • Lemma 1: chihara2011introduction
  • Theorem 1: Residual polynomials of MEG and their Chebyshev representation
  • Lemma 2: goujaud2022cyclical
  • Definition 1: Robust region of MEG
  • Remark 1
  • Theorem 2: Asymptotic convergence rate of MEG
  • Theorem 3
  • Remark 2
  • Proposition 1
  • Theorem 4: Case 1
  • ...and 23 more