Swap Regret and Correlated Equilibria Beyond Normal-Form Games

Eshwar Ram Arunachaleswaran, Natalie Collina, Yishay Mansour, Mehryar Mohri, Jon Schneider, Balasubramanian Sivan

TL;DR

This paper extends regret minimization beyond normal-form games by introducing profile swap regret for polytope games, and shows that sublinear profile swap regret is necessary and sufficient for a learning algorithm to be non-manipulable by a strategic opponent. It develops efficient learning algorithms, via Blackwell approachability, that achieve $O(\sqrt{T})$ profile swap regret, and analyzes the resulting equilibrium notion, revealing a gap between mediator-implementable correlated outcomes and profile correlated equilibria (profile CE) in general polytope games. It further distinguishes game-aware from game-agnostic learning, connects to previous work on swap regret and $\Phi$-regret, and proves APX-hardness of computing the profile-swap distance in Bayesian settings, while offering semi-separation-based methods for efficient learning. The work advances understanding of how to compute and reason about correlated outcomes in structured, non-normal-form games, with implications for mechanism design and decentralized equilibrium computation.

Abstract

Swap regret is a notion that has proven itself to be central to the study of general-sum normal-form games, with swap-regret minimization leading to convergence to the set of correlated equilibria and guaranteeing non-manipulability against a self-interested opponent. However, the situation for more general classes of games -- such as Bayesian games and extensive-form games -- is less clear-cut, with multiple candidate definitions for swap-regret but no known efficiently minimizable variant of swap regret that implies analogous non-manipulability guarantees. In this paper, we present a new variant of swap regret for polytope games that we call ``profile swap regret'', with the property that obtaining sublinear profile swap regret is both necessary and sufficient for any learning algorithm to be non-manipulable by an opponent (resolving an open problem of Mansour et al., 2022). Although we show profile swap regret is NP-hard to compute given a transcript of play, we show it is nonetheless possible to design efficient learning algorithms that guarantee at most $O(\sqrt{T})$ profile swap regret. Finally, we explore the correlated equilibrium notion induced by low-profile-swap-regret play, and demonstrate a gap between the set of outcomes that can be implemented by this learning process and the set of outcomes that can be implemented by a third-party mediator (in contrast to the situation in normal-form games).

Paper Structure

This paper contains 47 sections, 39 theorems, 63 equations, 1 algorithm.

Key Result

Theorem 1

Fix a polytope game $G$ with $\mathcal{X} = \Delta_m$. Then for any transcript of play ${\mathbf x} = (x_1, x_2, \dots, x_T)$ and ${\mathbf y} = (y_1, y_2, \dots, y_T)$ in $G$, we have that (Note that for normal-form games, the vertex game is identical to the original game, and therefore the quantity $\mathop{\mathrm{\mathsf{NFSwapReg}}}\nolimits({\mathbf x}, {\mathbf y})$ is well-defined).
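The normal-form swap regret mentioned in the note above can be computed directly from a transcript. The following is a minimal sketch, assuming the standard textbook definition of swap regret for a learner with bilinear utility $u(x, y) = x^\top A y$; the payoff matrix `A` and the function name `nf_swap_regret` are hypothetical and do not come from the paper, whose exact normalization may differ.

```python
def nf_swap_regret(A, xs, ys):
    """Standard normal-form swap regret of the learner (row player).

    A  : m x n payoff matrix as a list of lists (assumed, illustrative)
    xs : list of T learner mixed strategies, each of length m
    ys : list of T opponent mixed strategies, each of length n
    """
    m = len(A)
    # gain[i][j] accumulates sum_t x_t[i] * (u(e_j, y_t) - u(e_i, y_t)),
    # the benefit of always replacing action i with action j.
    gain = [[0.0] * m for _ in range(m)]
    for x, y in zip(xs, ys):
        # Payoff of each pure action against the opponent's strategy y_t.
        pay = [sum(a * b for a, b in zip(row, y)) for row in A]
        for i in range(m):
            for j in range(m):
                gain[i][j] += x[i] * (pay[j] - pay[i])
    # The best swap function picks the best replacement for each action
    # independently, so the maximum decomposes across rows.
    return sum(max(row) for row in gain)


# Illustrative 2x2 coordination game: the learner earns 1 when actions match.
A = [[1.0, 0.0], [0.0, 1.0]]
mismatched = nf_swap_regret(A, [[1, 0], [1, 0]], [[0, 1], [0, 1]])
matched = nf_swap_regret(A, [[1, 0], [0, 1]], [[1, 0], [0, 1]])
```

Because the maximum over swap functions decomposes per action, this computation takes $O(T m (m + n))$ time, which is why swap regret is easy to evaluate in the normal-form case.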

Theorems & Definitions (72)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Theorem 7
  • Theorem 8
  • Lemma 9
  • proof
  • ...and 62 more