From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces

Yuval Dagan; Constantinos Daskalakis; Maxwell Fishelson; Noah Golowich

TL;DR

The paper introduces a novel reduction from swap regret to external regret that does not require a finite action space, enabling no-swap-regret guarantees for broad hypothesis classes. The core construction, TreeSwap, uses a depth-$d$, $M$-ary tree of external-regret learners to bound swap regret by at most $\epsilon + 1/d$ after $T=M^d$ rounds, with per-round cost matching the external-regret oracle. This framework yields near-optimal upper and lower bounds for swap regret in the experts setting and extends to infinite and bandit settings, with tight lower bounds against oblivious and adaptive adversaries. The reduction has broad implications for equilibrium computation, providing efficient query/communication protocols for approximate CE/CCE in normal-form and extensive-form games, and it also yields near-tight bandit swap-regret algorithms. The work also clarifies the role of finite Littlestone dimension and related complexity notions in guaranteeing no-swap-regret learning and the existence of approximate correlated equilibria in broader settings.

Abstract

We provide a novel reduction from swap-regret minimization to external-regret minimization, which improves upon the classical reductions of Blum-Mansour [BM07] and Stolz-Lugosi [SL05] in that it does not require finiteness of the space of actions. We show that, whenever there exists a no-external-regret algorithm for some hypothesis class, there must also exist a no-swap-regret algorithm for that same class. For the problem of learning with expert advice, our result implies that it is possible to guarantee that the swap regret is bounded by $\epsilon$ after $\log(N)^{O(1/\epsilon)}$ rounds and with $O(N)$ per iteration complexity, where $N$ is the number of experts, while the classical reductions of Blum-Mansour and Stolz-Lugosi require $O(N/\epsilon^2)$ rounds and at least $\Omega(N^2)$ per iteration complexity. Our result comes with an associated lower bound, which -- in contrast to that in [BM07] -- holds for oblivious and $\ell_1$-constrained adversaries and learners that can employ distributions over experts, showing that the number of rounds must be $\tilde{\Omega}(N/\epsilon^2)$ or exponential in $1/\epsilon$. Our reduction implies that, if no-regret learning is possible in some game, then this game must have approximate correlated equilibria, of arbitrarily good approximation. This strengthens the folklore implication of no-regret learning that approximate coarse correlated equilibria exist. Importantly, it provides a sufficient condition for the existence of correlated equilibrium which vastly extends the requirement that the action set is finite, thus answering a question left open by [DG22; Ass+23]. Moreover, it answers several outstanding questions about equilibrium computation and learning in games.
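To make the two notions of regret in the abstract concrete, the following is a minimal sketch for the finite experts setting. It assumes losses in $[0,1]$ and reports regret averaged over the rounds; the function names `external_regret` and `swap_regret` are illustrative and do not come from the paper. External regret compares the learner to the best single expert, while swap regret compares it to the best remapping of each expert it played to some other expert.

```python
import numpy as np

def external_regret(plays, losses):
    """Average external regret.

    plays:  (T, N) array, row t is the learner's distribution over N experts.
    losses: (T, N) array, row t is the loss of each expert at round t.
    """
    incurred = np.sum(plays * losses)           # total expected loss of the learner
    best_fixed = np.min(losses.sum(axis=0))     # total loss of the best single expert
    return (incurred - best_fixed) / len(plays)

def swap_regret(plays, losses):
    """Average swap regret, maximized over all maps phi: [N] -> [N]."""
    # G[i, j] = sum_t plays[t, i] * losses[t, j]: the loss incurred if all
    # probability mass placed on expert i had been moved to expert j instead.
    G = plays.T @ losses
    incurred = np.trace(G)                      # phi = identity
    best_swapped = G.min(axis=1).sum()          # best target j chosen per source i
    return (incurred - best_swapped) / len(plays)

# Two rounds, two experts: the learner always picks the expert that loses.
plays = np.eye(2)
losses = np.eye(2)
print(external_regret(plays, losses), swap_regret(plays, losses))  # prints "0.5 1.0"
```

Because the identity map is one of the candidate swaps and every constant map is another, swap regret is never smaller than external regret, as the toy example shows.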

Paper Structure (57 sections, 38 theorems, 204 equations, 5 algorithms)

Introduction
Swap regret: challenges with large action spaces
Gaps in equilibrium computation.
Main results: near-optimal upper and lower bounds for swap regret
Applications: concrete swap regret bounds.
Applications: equilibrium computation.
Near-matching lower bounds.
Concurrent work.
Proof sketch of the upper bound (\ref{thm:swap-to-external})
Updating $\mathtt{Alg}_{\mathsf{Ext}}$ instances.
Swap regret bound.
Proof sketch for the lower bound (\ref{thm:lower-intro})
Case 1: $N \geq 4T$.
Case 2: $N < 4T$.
Discussion
...and 42 more sections

Key Result

Theorem 1.1

Let $d, M \in \mathbb{N}$ be given, and suppose that there is a learner for some function class $\mathcal{F}$ which achieves external regret of $\epsilon$ after $M$ iterations. Then there is a learner for $\mathcal{F}$ ($\mathtt{TreeSwap}$; Algorithm \ref{alg:treeswap}) which achieves a swap regret of at most $\epsilon + \frac{1}{d}$ after $T = M^d$ iterations.
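The tree structure behind this statement can be illustrated with a short sketch for the finite experts setting. This is not the authors' Algorithm; it is one reading of the depth-$d$, $M$-ary tree description, under the following assumptions: the external-regret learner is multiplicative weights (the hypothetical class `Hedge` below), the learner at level $h$ is updated once every $M^{d-h}$ rounds with the average loss over that block and restarted after $M$ such updates, and each round plays the uniform mixture of the $d$ learners currently active. The names `Hedge` and `tree_swap` are illustrative.

```python
import numpy as np

class Hedge:
    """Multiplicative weights over n experts; a stand-in external-regret learner."""
    def __init__(self, n, horizon):
        self.eta = np.sqrt(np.log(n) / max(horizon, 1))  # standard step size (assumed)
        self.cum = np.zeros(n)                            # cumulative loss per expert

    def act(self):
        w = np.exp(-self.eta * (self.cum - self.cum.min()))
        return w / w.sum()

    def update(self, loss):
        self.cum += loss

def tree_swap(losses, M, d):
    """Sketch of a TreeSwap-style learner.

    losses: (M**d, n) array of per-round expert losses.
    Returns the (M**d, n) array of distributions played.
    """
    T, n = losses.shape
    assert T == M ** d
    learners = [Hedge(n, M) for _ in range(d)]   # one active learner per tree level
    block_sum = np.zeros((d, n))                 # loss accumulated since last update
    plays = np.empty((T, n))
    for t in range(T):
        # Play the uniform mixture of the learners on the current root-to-leaf path.
        plays[t] = np.mean([lr.act() for lr in learners], axis=0)
        block_sum += losses[t]
        done = t + 1
        for h in range(d):
            period = M ** (d - 1 - h)            # rounds per update at level h (0-indexed)
            if done % period == 0:
                if done % (period * M) == 0:
                    learners[h] = Hedge(n, M)    # M updates completed: restart
                else:
                    learners[h].update(block_sum[h] / period)
                block_sum[h] = 0.0
    return plays
```

Each learner only ever runs for $M$ updates, which is where the "external regret $\epsilon$ after $M$ iterations" hypothesis enters, and each round touches $d$ learners, so the per-round cost stays within a factor $d$ of a single learner's cost in this sketch.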

Theorems & Definitions (74)

Theorem 1.1: Informal version of Theorem \ref{thm:treeswap}
Corollary 1.2: Upper bound for finite action swap regret; informal version of Corollary \ref{cor:treeswap-finiten}
Corollary 1.3: Swap regret for Littlestone classes; informal version of Corollaries \ref{cor:ldim-ce} and \ref{cor:ce-existence}
Theorem 1.4: Bandit swap regret; informal version of \ref{thm:bandit-tree-swap}
Corollary 1.5: Query and communication complexity upper bound; informal version of Corollaries \ref{cor:commce} and \ref{cor:queryce}
Corollary 1.6: Extensive form games; informal version of \ref{thm:efg-formal}
Theorem 1.7: Lower bound for swap regret with oblivious adversary; restatement of Corollary \ref{cor:lower-main}
Definition 2.1
Definition 2.2
Definition 2.3
...and 64 more
