Efficient $Φ$-Regret Minimization with Low-Degree Swap Deviations in Extensive-Form Games
Brian Hu Zhang, Ioannis Anagnostides, Gabriele Farina, Tuomas Sandholm
TL;DR
The paper studies efficient computation of correlated equilibria in extensive-form games through parameterized $\Phi$-regret minimization, using sets of deviations that interpolate between linear-swap regret and full swap regret. It introduces $k$-mediator deviations and degree-$k$ polynomial swap deviations, and shows that regret at most $\epsilon$ is attainable in $N^{O(k)}/\epsilon^2$ rounds for $k$-mediator deviations and in $N^{O((kd)^3)}/\epsilon^2$ rounds for degree-$k$ deviations, where $N$ is the dimension of the strategy space and $d$ is the depth of the game tree (assuming a constant branching factor). The second bound is quasipolynomial when the game tree is balanced, that is, when $d = \mathsf{polylog} N$. A key technical step is replacing hard fixed-point computations with approximate fixed points in expectation, using consistent deviation maps such as the behavioral map $\beta$ and the Carathéodory map $\gamma$; this enables fully polynomial no-regret learners in many regimes. Together, these results give a parameterized tractability framework for equilibria in extensive-form games and yield faster algorithms for computing extensive-form correlated equilibria (EFCE) and related equilibria, especially when the game tree is shallow or balanced.
Abstract
Recent breakthrough results by Dagan, Daskalakis, Fishelson and Golowich [2023] and Peng and Rubinstein [2023] established an efficient algorithm attaining at most $ε$ swap regret over extensive-form strategy spaces of dimension $N$ in $N^{\tilde O(1/ε)}$ rounds. On the other extreme, Farina and Pipis [2023] developed an efficient algorithm for minimizing the weaker notion of linear-swap regret in $\mathsf{poly}(N)/ε^2$ rounds. In this paper, we develop efficient parameterized algorithms for regimes between these two extremes. We introduce the set of $k$-mediator deviations, which generalize the untimed communication deviations recently introduced by Zhang, Farina and Sandholm [2024] to the case of having multiple mediators, and we develop algorithms for minimizing the regret with respect to this set of deviations in $N^{O(k)}/ε^2$ rounds. Moreover, by relating $k$-mediator deviations to low-degree polynomials, we show that regret minimization against degree-$k$ polynomial swap deviations is achievable in $N^{O(kd)^3}/ε^2$ rounds, where $d$ is the depth of the game, assuming a constant branching factor. For a fixed degree $k$, this is polynomial for Bayesian games and quasipolynomial more broadly when $d = \mathsf{polylog} N$, which is the usual balancedness assumption on the game tree.
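The last claim can be checked with a short calculation. The following is a sketch under stated assumptions, not taken from the paper: the exponent is read as $O((kd)^3)$, Bayesian games are taken to have depth $d = O(1)$, and the constant $c$ in $d \le (\log N)^c$ is a hypothetical name introduced only for illustration. For fixed $k$ and $d = O(1)$, the exponent is a constant, so

$$\frac{N^{O((kd)^3)}}{\epsilon^2} = \frac{N^{O(1)}}{\epsilon^2} = \frac{\mathsf{poly}(N)}{\epsilon^2}.$$

For fixed $k$ and $d \le (\log N)^c$, writing $N = 2^{\log N}$ gives

$$N^{O((kd)^3)} = 2^{O(k^3 d^3 \log N)} = 2^{O\left(k^3 (\log N)^{3c+1}\right)} = 2^{\mathsf{polylog} N},$$

which is quasipolynomial in $N$; dividing by $\epsilon^2$ leaves the dependence on $N$ unchanged.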
