On the $O(1/T)$ Convergence of Alternating Gradient Descent-Ascent in Bilinear Games

Tianlong Nan; Shuvomoy Das Gupta; Garud Iyengar; Christian Kroer

TL;DR

We study AltGDA and SimGDA in two-player zero-sum bilinear games and compare their convergence properties under constraints. We prove that AltGDA with a small constant stepsize achieves an $O(1/T)$ ergodic convergence rate when an interior Nash equilibrium exists, and we establish a local $O(1/T)$ rate for general bilinear games. A Performance Estimation Programming framework based on semidefinite programming is introduced to optimize the stepsize and worst-case rate, indicating potential $O(1/T)$ convergence for finite horizons while SimGDA remains limited to $O(1/\sqrt{T})$ in similar regimes. Numerical experiments corroborate the theoretical rates and illustrate AltGDA’s practical advantage over SimGDA in constrained minimax problems.

Abstract

We study the alternating gradient descent-ascent (AltGDA) algorithm in two-player zero-sum games. Alternating methods, where players take turns to update their strategies, have long been recognized as simple and practical approaches for learning in games, exhibiting much better numerical performance than their simultaneous counterparts. However, our theoretical understanding of alternating algorithms remains limited, and results are mostly restricted to the unconstrained setting. We show that for two-player zero-sum games that admit an interior Nash equilibrium, AltGDA converges at an $O(1/T)$ ergodic convergence rate when employing a small constant stepsize. This is the first result showing that alternation improves over the simultaneous counterpart of GDA in the constrained setting. For games without an interior equilibrium, we show an $O(1/T)$ local convergence rate with a constant stepsize that is independent of any game-specific constants. In a more general setting, we develop a performance estimation programming (PEP) framework to jointly optimize the AltGDA stepsize along with its worst-case convergence rate. The PEP results indicate that AltGDA may achieve an $O(1/T)$ convergence rate for a finite horizon $T$, whereas its simultaneous counterpart appears limited to an $O(1/\sqrt{T})$ rate.
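To make the difference between the two update orders concrete, the following is a minimal sketch, not the authors' implementation. It assumes the game is $\min_x \max_y x^\top A y$ over probability simplices, uses rock-paper-scissors as an illustrative payoff matrix (its uniform strategy is an interior Nash equilibrium), and takes the ergodic average as the plain mean of the iterates. The helper names `project_simplex`, `duality_gap`, and `run_gda`, the starting points, and the values of `eta` and `T` are all illustrative assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0                  # cumulative sums shifted by the simplex budget
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1] # last index that stays positive after the shift
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def duality_gap(A, x, y):
    """Gap of (x, y) for min_x max_y x^T A y over simplices; zero at a Nash equilibrium."""
    return np.max(A.T @ x) - np.min(A @ y)

def run_gda(A, x0, y0, eta, T, alternating):
    """Projected GDA with constant stepsize eta; returns the averaged iterates."""
    x, y = x0.copy(), y0.copy()
    x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
    for _ in range(T):
        x_new = project_simplex(x - eta * (A @ y))        # descent step for the min player
        x_seen = x_new if alternating else x              # AltGDA: y reacts to the fresh x
        y = project_simplex(y + eta * (A.T @ x_seen))     # ascent step for the max player
        x = x_new
        x_sum += x
        y_sum += y
    return x_sum / T, y_sum / T

# Illustrative instance (assumption): rock-paper-scissors, non-equilibrium start.
A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])
x0 = np.array([0.6, 0.3, 0.1])
y0 = np.array([0.2, 0.3, 0.5])

x_alt, y_alt = run_gda(A, x0, y0, eta=0.05, T=2000, alternating=True)
x_sim, y_sim = run_gda(A, x0, y0, eta=0.05, T=2000, alternating=False)
gap_alt = duality_gap(A, x_alt, y_alt)
gap_sim = duality_gap(A, x_sim, y_sim)
print(f"AltGDA averaged duality gap: {gap_alt:.3e}")
print(f"SimGDA averaged duality gap: {gap_sim:.3e}")
```

The only difference between the two variants is which copy of $x$ the max player sees in its ascent step. Rerunning with several horizons `T` and plotting the averaged gap on a log-log scale is one way to compare the empirical decay against the $O(1/T)$ and $O(1/\sqrt{T})$ rates discussed above.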
