Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng
TL;DR
This work addresses the open question of last-iterate convergence for Regret Matching$^+$-based dynamics in two-player zero-sum games. It analyzes ExRM$^+$ and SPRM$^+$, recent smoothed variants built on extragradient and optimistic-style updates, and proves that they converge asymptotically in the last iterate, in both duality gap and iterates, with an $O(1/\sqrt{t})$ best-iterate rate and, when combined with restarting, linear last-iterate convergence. A Minty-condition-based analysis reveals a geometric structure of the limit points, which enables convergence proofs even though the regret operators are non-monotone. Numerical experiments on matrix games, Kuhn poker, and Goofspiel corroborate the theory, showing substantial improvements over RM$^+$-type methods and demonstrating the practical value of restart schemes. The results offer a fresh variational-inequality perspective on last-iterate convergence for non-monotone operators and pave the way for robust RM$^+$-based solvers in large-scale extensive-form games.
Abstract
We study last-iterate convergence properties of algorithms for solving two-player zero-sum games based on Regret Matching$^+$ (RM$^+$). Despite their widespread use for solving real games, virtually nothing is known about their last-iterate convergence. A major obstacle to analyzing RM-type dynamics is that their regret operators lack Lipschitzness and (pseudo)monotonicity. We start by showing numerically that several variants used in practice, such as RM$^+$, predictive RM$^+$ and alternating RM$^+$, all lack last-iterate convergence guarantees even on a simple $3\times 3$ matrix game. We then prove that recent variants of these algorithms based on a smoothing technique, extragradient RM$^+$ and smooth predictive RM$^+$, enjoy asymptotic last-iterate convergence (without a rate), $O(1/\sqrt{t})$ best-iterate convergence, and, when combined with restarting, linear-rate last-iterate convergence. Our analysis builds on a new characterization of the geometric structure of the limit points of our algorithms, marking a significant departure from most of the literature on last-iterate convergence. We believe that our analysis may be of independent interest and offers a fresh perspective for studying last-iterate convergence in algorithms based on non-monotone operators.
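To make the distinction between last-iterate and average-iterate behavior concrete, the following is a minimal sketch of plain RM$^+$ in self-play on a $3\times 3$ matrix game, reporting the duality gap of both the last and the averaged strategies. It is an illustration under stated assumptions and not the authors' experimental code: the payoff matrix `A`, the horizon `T`, and all function names are chosen here for illustration, the row player is assumed to minimize $x^\top A y$, and the update is the standard simultaneous RM$^+$ rule rather than any of the smoothed variants studied in the paper.

```python
# Minimal sketch of simultaneous RM+ self-play on a matrix game (pure Python).
# Assumptions (not from the paper): the matrix A, the horizon T, and all names.
# Convention: the row player x minimizes x^T A y, the column player y maximizes.

def mat_vec(A, y):
    """Return A y (the loss of each row action against y)."""
    return [sum(a * b for a, b in zip(row, y)) for row in A]

def mat_t_vec(A, x):
    """Return A^T x (the utility of each column action against x)."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]

def normalize(R):
    """Map a nonnegative regret vector to a strategy (uniform if all zero)."""
    total = sum(R)
    return [r / total for r in R] if total > 0 else [1.0 / len(R)] * len(R)

def duality_gap(A, x, y):
    """max_j (A^T x)_j - min_i (A y)_i, which is zero exactly at equilibrium."""
    return max(mat_t_vec(A, x)) - min(mat_vec(A, y))

def rm_plus_self_play(A, T):
    n, m = len(A), len(A[0])
    Rx, Ry = [0.0] * n, [0.0] * m          # clipped cumulative regrets
    x, y = normalize(Rx), normalize(Ry)    # both players start uniform
    sum_x, sum_y = [0.0] * n, [0.0] * m
    for _ in range(T):
        # Accumulate the strategies actually played in this round.
        sum_x = [s + v for s, v in zip(sum_x, x)]
        sum_y = [s + v for s, v in zip(sum_y, y)]
        loss_x = mat_vec(A, y)
        util_y = mat_t_vec(A, x)
        value = sum(xi * li for xi, li in zip(x, loss_x))  # x^T A y
        # RM+ update: add instantaneous regrets, then clip at zero.
        Rx = [max(r + value - l, 0.0) for r, l in zip(Rx, loss_x)]
        Ry = [max(r + u - value, 0.0) for r, u in zip(Ry, util_y)]
        x, y = normalize(Rx), normalize(Ry)
    avg_x = [s / T for s in sum_x]
    avg_y = [s / T for s in sum_y]
    return (x, y), (avg_x, avg_y)

# Illustrative 3x3 payoff matrix (an assumption made for this sketch).
A = [[3.0, 0.0, -3.0],
     [0.0, 3.0, -4.0],
     [0.0, 0.0, 1.0]]

(last_x, last_y), (avg_x, avg_y) = rm_plus_self_play(A, 10000)
print("last-iterate duality gap:   ", duality_gap(A, last_x, last_y))
print("average-iterate duality gap:", duality_gap(A, avg_x, avg_y))
```

The standard regret bound for RM$^+$ guarantees that the average-iterate gap shrinks at rate $O(1/\sqrt{T})$. No such guarantee is known for the last iterate, so comparing the two printed values over several horizons is one way to probe the behavior the abstract describes.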
