Last-iterate Convergence for Symmetric, General-sum, $2 \times 2$ Games Under The Exponential Weights Dynamic
Guanghui Wang, Krishna Acharya, Lokranjan Lakshmikanthan, Juba Ziani, Vidya Muthukumar
TL;DR
This paper investigates the discrete-time Exponential Weights (EW) dynamic with a fixed learning rate in symmetric $2\times 2$ general-sum games, showing unprecedented last-iterate convergence to Nash equilibria for all non-degenerate cases. By analyzing the ratio dynamics $r_i^{(t)}$ and the fixed-point structure, the authors derive precise convergence regimes: same-sign $\epsilon_1$ and $\epsilon_2$ yield exponential convergence to a pure NE for any $\eta$, while opposite-sign cases depend on initialization, with potential convergence to a pure or strictly mixed NE, the latter under a step-size bound $\eta<\frac{8}{|\epsilon_1|+|\epsilon_2|}$. The results reconcile with known oscillations in broader settings by leveraging symmetry and a finite-sign-flip argument, and they are illustrated through congestion-game and performative-prediction applications, including a mortgage-competition bank model. Overall, the work provides sharp, global last-iterate convergence guarantees for the standard EW dynamic in a classical yet rich game-theoretic setting, with implications for learning dynamics in symmetric strategic environments. These findings offer practical insights for designing stable, iterative learning rules in competitive settings and suggest avenues for extending the analysis beyond $2\times 2$ action spaces.
Abstract
We conduct a comprehensive analysis of the discrete-time exponential-weights dynamic with a constant step size on all general-sum, symmetric $2 \times 2$ normal-form games, i.e., games with $2$ pure strategies per player in which the payoff tuple is of the form $(A,A^\top)$, where $A$ is the $2 \times 2$ payoff matrix of the first player. Such symmetric games commonly arise in real-world interactions between "symmetric" agents who have identically defined utility functions (for example, Bertrand competition, multi-agent performative prediction, and certain congestion games), and they display a rich multiplicity of equilibria despite the seemingly simple setting. Somewhat surprisingly, we show through a first-principles analysis that the exponential weights dynamic, which is popular in online learning, converges in the last iterate for such games regardless of initialization, provided the step size is chosen appropriately. For certain games and/or initializations, we further show that the convergence rate is in fact exponential and holds for any step size. We illustrate our theory with extensive simulations and applications to the aforementioned game-theoretic interactions. In the case of multi-agent performative prediction, we formulate a new "mortgage competition" game between lenders (i.e., banks) who interact with a population of customers, and show that it fits into our framework.
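To make the dynamic concrete, the following is a minimal sketch of the discrete-time exponential-weights update on a symmetric $2 \times 2$ game $(A, A^\top)$, and not the authors' code. It assumes the standard simultaneous multiplicative update, in which each player reweights an action by $\exp(\eta \cdot \text{expected payoff})$ and renormalizes. The function names (`expected_payoffs`, `ew_step`, `run_ew`), the payoff matrix, the initial strategies, and the step size are all illustrative choices, not values taken from the paper. The example game has a strictly dominant second action, so the last iterate should approach the pure equilibrium in which both players choose it.

```python
import math

def expected_payoffs(A, q):
    """Expected payoff of each of the two actions against an opponent mixing with q."""
    return [A[i][0] * q[0] + A[i][1] * q[1] for i in range(2)]

def ew_step(x, y, A, eta):
    """One simultaneous exponential-weights update for the symmetric game (A, A^T).

    Because player 2's payoff matrix is A^T, player 2's expected payoff for
    action j against x is (A x)_j, so both players use the same matrix A.
    """
    u_x = expected_payoffs(A, y)  # player 1's action payoffs against y
    u_y = expected_payoffs(A, x)  # player 2's action payoffs against x
    wx = [x[i] * math.exp(eta * u_x[i]) for i in range(2)]
    wy = [y[j] * math.exp(eta * u_y[j]) for j in range(2)]
    sx, sy = sum(wx), sum(wy)
    return [w / sx for w in wx], [w / sy for w in wy]

def run_ew(A, x0, y0, eta, steps):
    """Iterate the update and return the last iterate (x, y)."""
    x, y = list(x0), list(y0)
    for _ in range(steps):
        x, y = ew_step(x, y, A, eta)
    return x, y

# Illustrative game (an assumption, not from the paper): the second action
# strictly dominates the first, so the unique NE is pure.
A = [[3.0, 0.0],
     [5.0, 1.0]]
x, y = run_ew(A, x0=[0.9, 0.1], y0=[0.2, 0.8], eta=0.5, steps=200)
print(x, y)
```

In this example the log-ratio of the two action probabilities grows by at least $\eta$ per step, since the payoff gap between the actions is at least $1$. The mass on the dominated action therefore decays geometrically for any positive step size. Games whose equilibria include a strictly mixed one are more delicate and are where the step-size condition in the paper matters.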
