Accelerated regularized learning in finite N-person games
Kyriakos Lotidis, Angeliki Giannou, Panayotis Mertikopoulos, Nicholas Bambos
TL;DR
The paper asks whether Nesterov-style acceleration can speed up online learning in finite N-player games. It introduces "follow the accelerated leader" (FTXL), a momentum-augmented regularized learning scheme that combines dynamics inspired by Nesterov's accelerated gradient (NAG) algorithm with regularized best-response maps, and analyzes it in both continuous and discrete time. The main result is that FTXL converges locally to strict Nash equilibria at a superlinear rate, an exponential speedup over vanilla regularized learning, which converges to strict equilibria at a geometric (linear) rate. This rate persists under full-information, realization-based, and bandit (payoff-based) feedback. The authors provide a concrete discrete-time algorithm, convergence guarantees for each feedback model, and numerical simulations in zero-sum and congestion games.
Abstract
Motivated by the success of Nesterov's accelerated gradient algorithm for convex minimization problems, we examine whether it is possible to achieve similar performance gains in the context of online learning in games. To that end, we introduce a family of accelerated learning methods, which we call "follow the accelerated leader" (FTXL), and which incorporates the use of momentum within the general framework of regularized learning (in particular, the exponential/multiplicative weights algorithm and its variants). Drawing inspiration and techniques from the continuous-time analysis of Nesterov's algorithm, we show that FTXL converges locally to strict Nash equilibria at a superlinear rate, achieving in this way an exponential speed-up over vanilla regularized learning methods (which, by comparison, converge to strict equilibria at a geometric, linear rate). Importantly, FTXL maintains its superlinear convergence rate in a broad range of feedback structures, from deterministic, full information models to stochastic, realization-based ones, and even when run with bandit, payoff-based information, where players are only able to observe their individual realized payoffs.
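To make the contrast between the linear and superlinear rates concrete, the following is a minimal sketch, not the authors' implementation. It assumes full-information feedback, an entropic regularizer (so the choice map is a softmax, as in exponential weights), and a momentum update of the form p ← p + γv, y ← y + γp. That update form, the 2x2 payoff matrix, and the names `PAYOFF`, `softmax`, `payoff_vector`, and `run` are all illustrative assumptions that do not appear in the text above.

```python
import math

# Hypothetical symmetric 2x2 game (Prisoner's Dilemma payoffs).
# PAYOFF[own_action][opponent_action]; action 1 is strictly dominant,
# so the profile (1, 1) is a strict Nash equilibrium.
PAYOFF = [[3.0, 0.0],
          [5.0, 1.0]]


def softmax(scores):
    """Logit choice map: turns a score vector into a mixed strategy."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # shift for stability
    total = sum(weights)
    return [w / total for w in weights]


def payoff_vector(opponent_mix):
    """Expected payoff of each pure action against the opponent's mix."""
    return [sum(PAYOFF[a][b] * opponent_mix[b] for b in range(2))
            for a in range(2)]


def run(steps, step_size, accelerated):
    """Return the equilibrium gap at each step.

    The gap is the largest probability any player puts on the
    non-equilibrium action 0. With accelerated=False this is plain
    exponential weights; with accelerated=True payoffs are first
    accumulated into a momentum variable (assumed update form).
    """
    scores = [[0.0, 0.0], [0.0, 0.0]]
    momentum = [[0.0, 0.0], [0.0, 0.0]]
    gaps = []
    for _ in range(steps):
        mixes = [softmax(s) for s in scores]  # both players move at once
        gaps.append(max(mixes[0][0], mixes[1][0]))
        for i in range(2):
            v = payoff_vector(mixes[1 - i])
            for a in range(2):
                if accelerated:
                    momentum[i][a] += step_size * v[a]
                    scores[i][a] += step_size * momentum[i][a]
                else:
                    scores[i][a] += step_size * v[a]
    return gaps


if __name__ == "__main__":
    vanilla = run(21, 0.5, accelerated=False)
    fast = run(21, 0.5, accelerated=True)
    for n in (5, 10, 20):
        print(n, vanilla[n], fast[n])
```

In this toy game the score gap between the two actions grows linearly in the number of steps n for the vanilla update and quadratically for the momentum update. The equilibrium gap therefore decays roughly like exp(-cn) in the first case and exp(-cn²) in the second, which is the linear versus superlinear distinction drawn in the abstract. The momentum variant can lag during the first few steps, since its scores start moving more slowly.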
