Proximal Regret and Proximal Correlated Equilibria: A New Tractable Solution Concept for Online Learning and Games
Yang Cai, Constantinos Daskalakis, Haipeng Luo, Chen-Yu Wei, Weiqiang Zheng
TL;DR
The paper introduces proximal regret, a regret notion defined via proximal operators that lies between external and swap regret. It proves that Online Gradient Descent (GD) minimizes proximal regret at the optimal $O(\sqrt{T})$ rate for all $\rho$-weakly convex losses ($\rho<1$), implying that when all players run GD, the empirical distribution of play converges to proximal correlated equilibria (PCE), a refinement of coarse correlated equilibria (CCE). The framework extends to Mirror Descent in the Bregman setting and to Optimistic Gradient Descent, yielding faster convergence in smooth convex games and instance-dependent improvements in adversarial settings. These results offer a unified explanation for the strong empirical performance of GD and its variants in online learning and multi-agent systems, and they establish tractable pathways to equilibrium concepts stronger than CCE.
Abstract
Learning and computation of equilibria are central problems in game theory, theory of computation, and artificial intelligence. In this work, we introduce proximal regret, a new notion of regret based on proximal operators that lies strictly between external and swap regret. When every player employs a no-proximal-regret algorithm in a general convex game, the empirical distribution of play converges to proximal correlated equilibria (PCE), a refinement of coarse correlated equilibria. Our framework unifies several emerging notions in online learning and game theory, such as gradient equilibrium and semicoarse correlated equilibrium, and introduces new ones. Our main result shows that the classic Online Gradient Descent (GD) algorithm achieves an optimal $O(\sqrt{T})$ bound on proximal regret, revealing that GD, without modification, minimizes a stronger regret notion than external regret. This provides a new explanation for the empirically superior performance of gradient descent in online learning and games. We further extend our analysis to Mirror Descent in the Bregman setting and to Optimistic Gradient Descent, which yields faster convergence in smooth convex games.
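To make the main claim concrete, the following is a minimal Python sketch, not the authors' code, that runs projected Online Gradient Descent on random linear losses over the unit Euclidean ball and measures two quantities on the same run. It assumes one common reading of the definitions that the abstract does not spell out: the proximal operator is $\mathrm{prox}_f(x) = \arg\min_{x' \in \mathcal{X}} \{ f(x') + \tfrac{1}{2}\|x' - x\|^2 \}$, and the proximal regret against $f$ is $\sum_t \ell_t(x_t) - \ell_t(\mathrm{prox}_f(x_t))$. The quadratic choice of $f$, the step size, the loss sequence, and all function names are illustrative assumptions.

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def prox_quadratic(x, center, lam, radius=1.0):
    """prox_f(x) for the assumed f(y) = (lam/2)*||y - center||^2 on the ball.

    f(y) + 0.5*||y - x||^2 is an isotropic quadratic in y, so the constrained
    minimizer is the projection of the unconstrained minimizer.
    """
    return project_ball((x + lam * center) / (1.0 + lam), radius)

def run_ogd(grads, radius=1.0):
    """Projected online gradient descent on linear losses <g_t, x>."""
    T, d = grads.shape
    eta = radius / np.sqrt(T)  # assumed step size; relies on ||g_t|| <= 1
    x = np.zeros(d)
    iterates = np.empty((T, d))
    for t in range(T):
        iterates[t] = x  # play x_t, then observe g_t
        x = project_ball(x - eta * grads[t], radius)
    return iterates

def external_regret(iterates, grads, radius=1.0):
    """Incurred loss minus the loss of the best fixed point in the ball."""
    best_fixed = -radius * np.linalg.norm(grads.sum(axis=0))
    return np.sum(iterates * grads) - best_fixed

def proximal_regret(iterates, grads, center, lam, radius=1.0):
    """Incurred loss minus the loss of the deviation x_t -> prox_f(x_t)."""
    deviated = np.array(
        [prox_quadratic(x, center, lam, radius) for x in iterates]
    )
    return np.sum(iterates * grads) - np.sum(deviated * grads)

rng = np.random.default_rng(0)
T, d = 2000, 5
grads = rng.normal(size=(T, d))
# Rescale so every gradient has norm at most 1.
grads /= np.maximum(1.0, np.linalg.norm(grads, axis=1, keepdims=True))

iterates = run_ogd(grads)
center = project_ball(rng.normal(size=d))
ext = external_regret(iterates, grads)
prox = proximal_regret(iterates, grads, center, lam=1.0)
print(f"external regret: {ext:.3f}")
print(f"proximal regret (quadratic f): {prox:.3f}")
print(f"sqrt(T): {np.sqrt(T):.3f}")
```

The sketch checks a single deviation $\mathrm{prox}_f$ on one random loss sequence, so it only illustrates how the two quantities are computed; it is not evidence for the $O(\sqrt{T})$ guarantee, which concerns worst-case sequences and the whole family of proximal deviations.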
