Convergence of two-timescale gradient descent ascent dynamics: finite-dimensional and mean-field perspectives

Jing An, Jianfeng Lu

TL;DR

The paper studies two-timescale gradient-descent-ascent dynamics for min-max problems in both finite-dimensional and mean-field (infinite-dimensional) regimes. In the finite-dimensional case, it employs hypocoercivity to obtain exponential convergence for a quadratic game, with the convergence rate depending on the learning-rate ratio $\eta$ and showing near-optimal performance when $\eta \approx 1$. In the mean-field setting, it analyzes entropy-regularized min-max objectives and proves Wasserstein-1 contraction to a mixed Nash equilibrium for a finite range of $\eta$ using a mixed synchronous-reflection coupling, accommodating locally nonconvex-nonconcave objectives. The work also introduces a preconditioning strategy to mitigate extreme $\eta$ sensitivity and offers an averaging-based alternative in the Appendix. Collectively, the results provide guidance on choosing $\eta$ and extend convergence guarantees to both finite and mean-field min-max problems.

Abstract

The two-timescale gradient descent-ascent (GDA) is a canonical gradient algorithm designed to find Nash equilibria in min-max games. We analyze the two-timescale GDA by investigating the effects of learning rate ratios on convergence behavior in both finite-dimensional and mean-field settings. In particular, for finite-dimensional quadratic min-max games, we obtain long-time convergence in near quasi-static regimes through the hypocoercivity method. For mean-field GDA dynamics, we investigate convergence under a finite-scale ratio using a mixed synchronous-reflection coupling technique.
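To make the role of the learning-rate ratio concrete, here is a minimal sketch of two-timescale GDA on a one-dimensional quadratic min-max game. The objective $f(x,y) = \tfrac{1}{2} q x^2 + p x y - \tfrac{1}{2} r y^2$, the explicit-Euler discretization, the choice to put the ratio `eta` on the ascent variable, and all names and parameter values are assumptions made for illustration; they are not taken from the paper.

```python
def two_timescale_gda(q, r, p, eta, x0, y0, step=0.01, n_steps=5000):
    """Explicit-Euler two-timescale GDA on the scalar quadratic game
    f(x, y) = 0.5*q*x**2 + p*x*y - 0.5*r*y**2 (min over x, max over y).

    Assumption: the learning-rate ratio eta multiplies the ascent step.
    """
    x, y = x0, y0
    for _ in range(n_steps):
        grad_x = q * x + p * y  # partial derivative of f in x
        grad_y = p * x - r * y  # partial derivative of f in y
        # Simultaneous update: descent in x, ascent in y scaled by eta.
        x, y = x - step * grad_x, y + step * eta * grad_y
    return x, y


# Hypothetical parameter values, chosen only for illustration.
x_final, y_final = two_timescale_gda(q=1.0, r=1.0, p=2.0, eta=2.0, x0=1.0, y0=1.0)
```

With `q > 0` and `r > 0` the game is strongly convex in `x` and strongly concave in `y`, so its unique Nash equilibrium is the origin, and for a sufficiently small `step` the iterates should contract toward it. Varying `eta` in this sketch is one way to observe how the ratio changes the speed of that contraction.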

Paper Structure

This paper contains 22 sections, 7 theorems, 136 equations, 1 figure.

Key Result

Lemma 3.1

Given the operator $M$ in (operator:M) and Assumption assump:ag, we have the bounds

Figures (1)

  • Figure 3.1: An illustration of how the least eigenvalue of $\frac{1}{\sqrt{\eta}} D+L$ depends on $\eta$ (chosen to be $\eta = 0.01:0.01:10$). We randomly generate $10\times10$ symmetric semi-definite matrices $Q, R$ to create $D=\begin{bmatrix} Q & 0 \\ 0 & \eta R \end{bmatrix}$ and $10\times 10$ matrix $P$ to create $L=\begin{bmatrix} 0 & P \\ -P^{\top} & 0 \end{bmatrix}$. The kink points in the plot are caused by eigenvalue crossings.
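
The experiment described in the caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' script: "least eigenvalue" is interpreted here as the smallest real part in the spectrum (the matrix is not symmetric), the random matrices are drawn from a standard normal distribution with an arbitrary seed, and the helper names `random_psd` and `least_real_eig` are hypothetical.

```python
import numpy as np


def random_psd(n, rng):
    """Random symmetric positive semi-definite matrix (assumed sampling scheme)."""
    B = rng.standard_normal((n, n))
    return B @ B.T


def least_real_eig(Q, R, P, eta):
    """Smallest real part among the eigenvalues of D / sqrt(eta) + L."""
    n = Q.shape[0]
    zero = np.zeros((n, n))
    D = np.block([[Q, zero], [zero, eta * R]])  # block-diagonal part
    L = np.block([[zero, P], [-P.T, zero]])     # skew-symmetric coupling
    return np.linalg.eigvals(D / np.sqrt(eta) + L).real.min()


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 10
Q, R = random_psd(n, rng), random_psd(n, rng)
P = rng.standard_normal((n, n))

etas = np.arange(1, 1001) * 0.01  # the grid 0.01, 0.02, ..., 10
least = np.array([least_real_eig(Q, R, P, eta) for eta in etas])
```

Plotting `least` against `etas` should give a curve of the kind the caption describes. Because $L$ is skew-symmetric and $D$ is positive semi-definite, every value in `least` is nonnegative up to rounding error.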

Theorems & Definitions (18)

  • Lemma 3.1
  • proof
  • Theorem 3.1
  • proof
  • Claim 4.1
  • proof
  • Lemma 4.2
  • proof
  • Remark 4.3
  • Lemma 4.4
  • ...and 8 more