Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability

Brian Hu Zhang; Ioannis Anagnostides; Emanuel Tewolde; Ratip Emin Berker; Gabriele Farina; Vincent Conitzer; Tuomas Sandholm

Abstract

$Φ$-equilibria -- and the associated notion of $Φ$-regret -- are a powerful and flexible framework at the heart of online learning and game theory, whereby enriching the set of deviations $Φ$ begets stronger notions of rationality. Recently, Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC '24) -- abbreviated as DFFPS -- settled the existence of efficient algorithms when $Φ$ contains only linear maps under a general, $d$-dimensional convex constraint set $\mathcal{X}$. In this paper, we significantly extend their work by resolving the case where $Φ$ is $k$-dimensional; degree-$\ell$ polynomials constitute a canonical such example with $k = d^{O(\ell)}$. In particular, positing only oracle access to $\mathcal{X}$, we obtain two main positive results: i) a $\text{poly}(n, d, k, \log(1/\epsilon))$-time algorithm for computing $\epsilon$-approximate $Φ$-equilibria in $n$-player multilinear games, and ii) an efficient online algorithm that incurs average $Φ$-regret at most $\epsilon$ using $\text{poly}(d, k)/\epsilon^2$ rounds. We also show nearly matching lower bounds in the online learning setting, thereby obtaining for the first time a family of deviations that captures the learnability of $Φ$-regret. From a technical standpoint, we extend the framework of DFFPS from linear maps to the more challenging case of maps with polynomial dimension. At the heart of our approach is a polynomial-time algorithm for computing an expected fixed point of any $φ : \mathcal{X} \to \mathcal{X}$ based on the ellipsoid against hope (EAH) algorithm of Papadimitriou and Roughgarden (JACM '08). In particular, our algorithm for computing $Φ$-equilibria is based on executing EAH in a nested fashion -- each step of EAH itself being implemented by invoking a separate call to EAH.
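
To make the notion of average $Φ$-regret concrete, the following minimal sketch evaluates it on a toy instance. It is not the paper's algorithm: it assumes the standard definition, $\max_{φ \in Φ} \frac{1}{T} \sum_{t=1}^{T} \langle u_t, φ(x_t) - x_t \rangle$ for linear utilities $u_t$, and it assumes a small finite deviation set that can be enumerated directly. All function names and the example data are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Deviation = Callable[[Vector], List[float]]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def average_phi_regret(
    plays: Sequence[Vector],
    utilities: Sequence[Vector],
    deviations: Sequence[Deviation],
) -> float:
    """Return max over phi of (1/T) * sum_t <u_t, phi(x_t) - x_t>.

    Assumes linear utilities and a finite, explicitly listed deviation set.
    """
    num_rounds = len(plays)
    best = float("-inf")
    for phi in deviations:
        # Total utility gained by replacing each play x_t with phi(x_t).
        gain = sum(dot(u, phi(x)) - dot(u, x) for x, u in zip(plays, utilities))
        best = max(best, gain / num_rounds)
    return best


# Hypothetical deviations on the 2-action simplex.
def identity(x: Vector) -> List[float]:
    return list(x)


def swap(x: Vector) -> List[float]:
    return [x[1], x[0]]  # linear map exchanging the two actions


def to_first(x: Vector) -> List[float]:
    return [1.0, 0.0]  # constant map: always play action 1


def to_second(x: Vector) -> List[float]:
    return [0.0, 1.0]  # constant map: always play action 2


# Toy sequence of plays x_t and utility vectors u_t (illustrative data only).
plays = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
utilities = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

regret = average_phi_regret(plays, utilities, [identity, swap, to_first, to_second])
print(regret)  # 0.25
```

On this data the best deviation is the constant map to the second action, which gains one unit of utility in total over four rounds. Enlarging the deviation set can only increase the maximum, which is the sense in which a richer $Φ$ yields a more demanding benchmark.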