Full Swap Regret and Discretized Calibration

Maxwell Fishelson, Robert Kleinberg, Princewill Okoroafor, Renato Paes Leme, Jon Schneider, Yifeng Teng

TL;DR

This work develops a framework for minimizing swap regret in high-cardinality but structurally low-dimensional games by embedding actions into a $d$-dimensional space and leveraging online convex optimization. The authors introduce full swap regret and show it can be bounded in ways that depend on convexity and smoothness properties, yielding rates like $\tilde{O}(T^{(d+1)/(d+3)})$ in structured games. They unify online forecasting calibration with swap regret, obtaining a $\tilde{O}(T^{1/3})$ bound for $\ell_2$-calibration and a $\tilde{O}(\max(\sqrt{\epsilon T}, T^{1/3}))$ bound for discretized calibration, via an extended Blum–Mansour template. The paper also develops discretization, rounding, and nearly-strongly-convex external-regret techniques, including new algorithms for nearly-strongly-convex losses and a principled pathway from structured games to calibration guarantees with polylogarithmic dependence on the horizon when appropriate structure is present. Overall, the results yield efficient, dimension-dependent, and action-count-independent guarantees for swap regret and calibration in broad online-learning and forecasting contexts, with practical implications for correlated equilibria in large-scale structured games.

Abstract

We study the problem of minimizing swap regret in structured normal-form games. Players have a very large (potentially infinite) number of pure actions, but each action has an embedding into $d$-dimensional space and payoffs are given by bilinear functions of these embeddings. We provide an efficient learning algorithm for this setting that incurs at most $\tilde{O}(T^{(d+1)/(d+3)})$ swap regret after $T$ rounds. To achieve this, we introduce a new online learning problem we call \emph{full swap regret minimization}. In this problem, a learner repeatedly takes a (randomized) action in a bounded convex $d$-dimensional action set $\mathcal{K}$ and then receives a loss from the adversary, with the goal of minimizing their regret with respect to the \emph{worst-case} swap function mapping $\mathcal{K}$ to $\mathcal{K}$. For varied assumptions about the convexity and smoothness of the loss functions, we design algorithms with full swap regret bounds ranging from $O(T^{d/(d+2)})$ to $O(T^{(d+1)/(d+2)})$. Finally, we apply these tools to the problem of online forecasting to minimize calibration error, showing that several notions of calibration can be viewed as specific instances of full swap regret. In particular, we design efficient algorithms for online forecasting that guarantee at most $O(T^{1/3})$ $\ell_2$-calibration error and $O(\max(\sqrt{\epsilon T}, T^{1/3}))$ \emph{discretized-calibration} error (when the forecaster is restricted to predicting multiples of $\epsilon$).
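
The link between calibration and swap regret mentioned in the abstract can be checked numerically on a toy example. The sketch below is illustrative only and is not taken from the paper: it assumes the standard definition of $\ell_2$-calibration error, $\sum_p n_p (p - \rho_p)^2$, where $n_p$ is the number of rounds with forecast $p$ and $\rho_p$ is the average outcome on those rounds, and the paper's normalization may differ. The function names `l2_calibration_error` and `squared_loss_swap_regret` are hypothetical. It compares that quantity with the regret, under squared loss, against the best map sending each forecast value to an arbitrary real number.

```python
from collections import defaultdict


def l2_calibration_error(forecasts, outcomes):
    """Sum over distinct forecast values p of n_p * (p - rho_p)^2.

    Assumed (standard) definition; not copied from the paper.
    """
    count = defaultdict(int)
    total = defaultdict(float)
    for p, y in zip(forecasts, outcomes):
        count[p] += 1
        total[p] += y
    return sum(n * (p - total[p] / n) ** 2 for p, n in count.items())


def squared_loss_swap_regret(forecasts, outcomes):
    """Squared-loss regret against the best swap p -> phi(p), phi(p) any real."""
    by_forecast = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        by_forecast[p].append(y)
    regret = 0.0
    for p, ys in by_forecast.items():
        # The mean of the outcomes minimizes total squared loss on this bin.
        best = sum(ys) / len(ys)
        incurred = sum((p - y) ** 2 for y in ys)
        optimal = sum((best - y) ** 2 for y in ys)
        regret += incurred - optimal
    return regret


# Toy sequence: forecasts are multiples of 0.25, outcomes are binary.
forecasts = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75]
outcomes = [0, 0, 1, 1, 1, 1]

cal = l2_calibration_error(forecasts, outcomes)
swap = squared_loss_swap_regret(forecasts, outcomes)
print(cal, swap)
```

On this sequence both quantities come out to 0.375: the forecast 0.25 is used four times with average outcome 0.5, and the forecast 0.75 twice with average outcome 1. The agreement is the usual bias-variance identity for squared loss, applied separately to each forecast value.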

Paper Structure

This paper contains 47 sections, 22 theorems, 54 equations, 1 figure, 3 tables, 6 algorithms.

Key Result

Theorem 1

There exists a learning algorithm for the Learner which incurs at most $\tilde{O}(T^{(d+1)/(d+3)})$ swap regret against any Adversary in any $d$-dimensional structured game. Equivalently, the Learner can guarantee $\epsilon$ per-round swap regret as long as $T = \tilde{\Omega}((1/\epsilon)^{(d+3)/2})$.
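
The equivalence between the two forms of the bound follows from dividing by $T$. The short derivation below is a sketch that ignores the constants and polylogarithmic factors hidden by the tilde notation; the constant $C$ is introduced here only for illustration.

```latex
\begin{align*}
\frac{\mathrm{SwapRegret}(T)}{T}
  &\le \frac{C\,T^{(d+1)/(d+3)}}{T}
   = C\,T^{-2/(d+3)}, \\
C\,T^{-2/(d+3)} \le \epsilon
  &\iff T \ge \left(\frac{C}{\epsilon}\right)^{(d+3)/2}.
\end{align*}
```

For example, $d = 1$ gives swap regret of order $T^{1/2}$ and a horizon of order $1/\epsilon^{2}$, while $d = 2$ gives $T^{3/5}$ and $1/\epsilon^{5/2}$.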

Figures (1)

  • Figure 1: The $x$-axis shows the discretization parameter $\epsilon$ and the $y$-axis the calibration loss $\mathsf{Cal}$ for three different algorithms: dashed blue (regular swap regret on $1/\epsilon$ actions), dashed black (algorithm in Theorem \ref{thm:calib_bound} + rounding) and red (algorithm in Theorem \ref{thm:discrete_calib_bound}).

Theorems & Definitions (26)

  • Theorem 1
  • Corollary 2
  • Theorem 3: Informal version of Theorem \ref{thm:full-main}
  • Theorem 4: $\ell_2$-Calibration
  • Theorem 5: Discretized $\ell_2$-Calibration
  • Theorem 6: Informal version of Theorem \ref{thm:disc-main}
  • Definition 7: $\epsilon$-net
  • Lemma 8: Chapter 4.2 of vershynin2018high
  • Definition 9: $\epsilon$-triangulation
  • Lemma 10
  • ...and 16 more