Swap Regret Minimization Through Response-Based Approachability

Ioannis Anagnostides, Gabriele Farina, Maxwell Fishelson, Haipeng Luo, Jon Schneider

TL;DR

This work studies online learning under $\Phi$-regret, focusing on swap regret and its links to correlated equilibria and non-manipulability. It combines the response-based approachability framework of Bernstein and Shimkin with geometric preconditioning via the John ellipsoid to obtain $\mathsf{LinearSwapReg}_T = O(d^{3/2} \sqrt{T})$ for general convex sets and $O(d \sqrt{T})$ for centrally symmetric sets, while also minimizing profile swap regret. A matching information-theoretic lower bound of $\Omega(d \sqrt{T})$ shows that the rate is optimal in the centrally symmetric case, and the method extends to sets of swap deviations with polynomial dimension, unifying recent results in equilibrium computation and online learning. The lower bound also shows that the algorithm of Gordon, Greenwald, and Marks attains the optimal rate for linear swap regret, although it is computationally inefficient.

Abstract

We consider the problem of minimizing different notions of swap regret in online optimization. These forms of regret are tightly connected to correlated equilibrium concepts in games, and have been more recently shown to guarantee non-manipulability against strategic adversaries. The only computationally efficient algorithm for minimizing linear swap regret over a general convex set in $\mathbb{R}^d$ was developed recently by Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC '25). However, it incurs a highly suboptimal regret bound of $\Omega(d^4 \sqrt{T})$ and also relies on computationally intensive calls to the ellipsoid algorithm at each iteration. In this paper, we develop a significantly simpler, computationally efficient algorithm that guarantees $O(d^{3/2} \sqrt{T})$ linear swap regret for a general convex set and $O(d \sqrt{T})$ when the set is centrally symmetric. Our approach leverages the powerful response-based approachability framework of Bernstein and Shimkin (JMLR '15) -- previously overlooked in the line of work on swap regret minimization -- combined with geometric preconditioning via the John ellipsoid. Our algorithm simultaneously minimizes profile swap regret, which was recently shown to guarantee non-manipulability. Moreover, we establish a matching information-theoretic lower bound: any learner must incur in expectation $\Omega(d \sqrt{T})$ linear swap regret for large enough $T$, even when the set is centrally symmetric. This also shows that the classic algorithm of Gordon, Greenwald, and Marks (ICML '08) is existentially optimal for minimizing linear swap regret, although it is computationally inefficient. Finally, we extend our approach to minimize regret with respect to the set of swap deviations with polynomial dimension, unifying and strengthening recent results in equilibrium computation and online learning.
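
To get a feel for the gap between the rates quoted in the abstract, the following minimal sketch evaluates $d^{4}\sqrt{T}$, $d^{3/2}\sqrt{T}$, and $d\sqrt{T}$ with all constants dropped. The function names `regret_rate` and `compare` and the labels in `RATES` are illustrative assumptions, not anything defined in the paper, and the numbers only indicate scaling, not actual regret values.

```python
import math


def regret_rate(d: int, T: int, dim_exponent: float) -> float:
    """Return d**dim_exponent * sqrt(T); hidden constants are ignored."""
    return d ** dim_exponent * math.sqrt(T)


# Dimension exponents quoted in the abstract; the labels are illustrative.
RATES = {
    "prior efficient algorithm (d^4)": 4.0,
    "general convex set (d^{3/2})": 1.5,
    "centrally symmetric set (d)": 1.0,
}


def compare(d: int, T: int) -> dict:
    """Evaluate every rate in RATES at the same dimension d and horizon T."""
    return {name: regret_rate(d, T, exp) for name, exp in RATES.items()}


if __name__ == "__main__":
    for name, value in compare(d=10, T=10_000).items():
        print(f"{name}: {value:,.0f}")
```

At $d = 10$ and $T = 10^4$ the $d^4\sqrt{T}$ rate already exceeds $T$ itself, whereas the other two rates remain well below $T$.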

Paper Structure

This paper contains 25 sections, 36 theorems, 86 equations, 1 figure, and 4 algorithms.

Key Result

Theorem 1.1

If $\mathcal{P} \subset \mathbb R^d$ is a convex body, there is a computationally efficient algorithm that guarantees $O(d^{3/2} \sqrt{T})$ linear swap regret. If $\mathcal{P}$ is centrally symmetric, this can be improved to $O(d \sqrt{T})$.
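
For orientation, one common way to formalize the quantity bounded in Theorem 1.1 is sketched below. The symbols $x_t$ (the learner's action in round $t$), $u_t$ (the utility vector in round $t$), and $\Phi_{\mathrm{lin}}$ (the linear maps that send $\mathcal{P}$ into itself) are notational assumptions made here and do not appear on this page; the paper's own definition may differ in details such as normalization or the treatment of randomized learners.

$$
\mathsf{LinearSwapReg}_T \;=\; \max_{\phi \in \Phi_{\mathrm{lin}}} \sum_{t=1}^{T} \big\langle u_t,\ \phi(x_t) - x_t \big\rangle,
\qquad
\Phi_{\mathrm{lin}} = \{\phi : \mathbb{R}^d \to \mathbb{R}^d \ \text{linear},\ \phi(\mathcal{P}) \subseteq \mathcal{P}\}.
$$

Under this reading, the theorem says that the best fixed linear transformation of the learner's past actions gains at most $O(d^{3/2}\sqrt{T})$ in hindsight, or $O(d\sqrt{T})$ when $\mathcal{P}$ is centrally symmetric.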

Figures (1)

  • Figure 1: Illustration of how the response-based approachability algorithm (`alg:shimkin`) produces a separating hyperplane (minimax) versus Blackwell's choice (support).

Theorems & Definitions (49)

  • Theorem 1.1
  • Theorem 1.2
  • Proposition 2.1
  • Lemma 3.0
  • Lemma 3.1
  • Lemma 3.1
  • Theorem 3.2
  • Corollary 3.3
  • Lemma 3.3
  • Theorem 3.4: John's Theorem
  • ...and 39 more