Convergence of two-timescale gradient descent ascent dynamics: finite-dimensional and mean-field perspectives
Jing An, Jianfeng Lu
TL;DR
The paper studies two-timescale gradient descent-ascent (GDA) dynamics for min-max problems in both finite-dimensional and mean-field (infinite-dimensional) regimes. In the finite-dimensional case, it uses the hypocoercivity method to obtain exponential convergence for a quadratic game, with a convergence rate that depends on the learning-rate ratio $\eta$ and near-optimal performance when $\eta \approx 1$. In the mean-field setting, it analyzes entropy-regularized min-max objectives and proves Wasserstein-1 contraction to a mixed Nash equilibrium for a finite range of $\eta$ using a mixed synchronous-reflection coupling, which accommodates locally nonconvex-nonconcave objectives. The work also introduces a preconditioning strategy to reduce sensitivity to extreme values of $\eta$ and offers an averaging-based alternative in the appendix. Together, the results give guidance on choosing $\eta$ and extend convergence guarantees to both finite-dimensional and mean-field min-max problems.
Abstract
The two-timescale gradient descent-ascent (GDA) is a canonical gradient algorithm designed to find Nash equilibria in min-max games. We analyze the two-timescale GDA by investigating the effects of learning rate ratios on convergence behavior in both finite-dimensional and mean-field settings. In particular, for finite-dimensional quadratic min-max games, we obtain long-time convergence in near quasi-static regimes through the hypocoercivity method. For mean-field GDA dynamics, we investigate convergence under a finite-scale ratio using a mixed synchronous-reflection coupling technique.
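
To make the role of the learning-rate ratio concrete, the following is a minimal sketch of two-timescale GDA on a quadratic min-max game, discretized with explicit Euler steps. It is an illustration only and is not taken from the paper: the objective $f(x, y) = \tfrac12 x^\top A x + x^\top B y - \tfrac12 y^\top C y$, the convention that $\eta$ multiplies the ascent step, and the names `two_timescale_gda`, `A`, `B`, `C`, `step`, and `n_steps` are all assumptions chosen for the example.

```python
import numpy as np


def two_timescale_gda(A, B, C, x0, y0, eta, step=1e-2, n_steps=5000):
    """Explicit Euler discretization of the (assumed) two-timescale GDA flow

        dx/dt = -grad_x f(x, y),    dy/dt = eta * grad_y f(x, y),

    for the illustrative quadratic game
        f(x, y) = 0.5 * x^T A x + x^T B y - 0.5 * y^T C y,
    with A and C symmetric positive definite, so that the unique
    Nash equilibrium is (x, y) = (0, 0).
    """
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):
        grad_x = A @ x + B @ y        # gradient of f in the min variable
        grad_y = B.T @ x - C @ y      # gradient of f in the max variable
        # Simultaneous update: descent in x, ascent in y at eta times the rate.
        x, y = x - step * grad_x, y + step * eta * grad_y
    return x, y


if __name__ == "__main__":
    A = np.array([[1.0]])
    B = np.array([[0.5]])
    C = np.array([[1.0]])
    for eta in (1.0, 10.0):
        x, y = two_timescale_gda(A, B, C, x0=[1.0], y0=[-1.0], eta=eta)
        dist = np.sqrt(np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2)
        print(f"eta = {eta:5.1f}  distance to equilibrium = {dist:.3e}")
```

Varying `eta` in this sketch changes how fast the iterates approach the equilibrium. For large `eta` the explicit Euler step must be small enough that `step * eta` times the largest eigenvalue of `C` stays well below 2, which is a constraint of this discretization rather than of the continuous-time dynamics.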
