Near-optimal Swap Regret Minimization for Convex Losses
Lunjia Hu, Jon Schneider, Yifan Wu
TL;DR
The paper resolves an open question on swap regret minimization for sequences of convex, Lipschitz losses on the unit interval $[0,1]$ by giving an efficient online algorithm with $\mathbb E[\mathsf{SR}]=O(\sqrt T\log T)$ and a high-probability bound of $O\big(\sqrt{(T\log T)\log(1/\delta)}\big)$. The core techniques are multi-scale binning and a V-shaped decomposition, which together enable a reduction to base losses and balance the trade-off between sampling error and rounding error. An efficient multi-objective learning framework (AMF) combined with the MsMwC expert algorithm yields a $\mathsf{poly}(T)$-time predictor that achieves this near-optimal regret, and the results extend to calibration for elicitable properties, including the mean, the median, and quantiles. This advances online calibration and learning in games by providing near-optimal, efficient guarantees for continuous action spaces under adversarial convex losses. The practical impact spans calibration of predictive distributions and downstream decision making that depends on robust, transform-invariant regret guarantees.
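To make the quantity $\mathsf{SR}$ concrete, the following is a minimal sketch of how swap regret could be evaluated after the fact for a finite sequence of predictions. It is not the paper's algorithm. It assumes the standard definition, in which every round that used the same prediction $p$ is remapped to the single best alternative for that group of rounds. The function name `swap_regret`, the `grid_size` parameter, and the grid search over $[0,1]$ are illustrative assumptions; the grid only approximates the best alternative, so the returned value is a lower bound on the true swap regret.

```python
from collections import defaultdict


def swap_regret(predictions, losses, grid_size=1001):
    """Approximate swap regret on [0, 1] (illustrative helper, not from the paper).

    predictions: sequence of floats in [0, 1], one per round.
    losses: sequence of callables, losses[t](x) is the loss of action x in round t.
    """
    # Group the rounds by the prediction that was played in them.
    groups = defaultdict(list)
    for p, loss in zip(predictions, losses):
        groups[p].append(loss)

    # Candidate replacement actions: a uniform grid on [0, 1] (an assumption;
    # the exact definition takes the infimum over the whole interval).
    grid = [i / (grid_size - 1) for i in range(grid_size)]

    total = 0.0
    for p, group in groups.items():
        incurred = sum(loss(p) for loss in group)
        # Including p itself guarantees each group's contribution is >= 0.
        best = min(sum(loss(q) for loss in group) for q in grid + [p])
        total += incurred - best
    return total


def abs_loss(y):
    """Absolute loss x -> |x - y|, which is convex and 1-Lipschitz."""
    return lambda x: abs(x - y)
```

For example, predicting $0.5$ in two rounds whose losses are both $|x-1|$ incurs total loss $1$, while swapping $0.5 \mapsto 1$ incurs $0$, so the swap regret is $1$.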
Abstract
We give a randomized online algorithm that guarantees near-optimal $\widetilde O(\sqrt T)$ expected swap regret against any sequence of $T$ adaptively chosen Lipschitz convex losses on the unit interval. This improves the previous best bound of $\widetilde O(T^{2/3})$ and answers an open question of Fishelson et al. [2025b]. In addition, our algorithm is efficient: it runs in $\mathsf{poly}(T)$ time. A key technical idea we develop to obtain this result is to discretize the unit interval into bins at multiple scales of granularity and simultaneously use all scales to make randomized predictions. We call this idea multi-scale binning, and it may be of independent interest. A direct corollary of our result is an efficient online algorithm for minimizing the calibration error for general elicitable properties. This result does not require the Lipschitzness assumption on the identification function needed in prior work, making it applicable to median calibration, for which we achieve the first $\widetilde O(\sqrt T)$ calibration error guarantee.
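As a rough illustration of what discretizing the unit interval at multiple scales can look like, the sketch below lists the bin containing a point at each of several resolutions. The dyadic choice of scales ($2^k$ equal-width bins at level $k$), the function name `dyadic_bins`, and the `max_level` parameter are assumptions made for this example only. The abstract does not specify the paper's actual bin construction or how predictions are randomized across scales, and neither is reproduced here.

```python
def dyadic_bins(x, max_level):
    """Return the bin containing x at each dyadic level 0..max_level.

    Hypothetical helper: level k splits [0, 1] into 2**k equal-width bins.
    Each entry is (level, bin_index, left_endpoint, right_endpoint).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    bins = []
    for level in range(max_level + 1):
        n = 2 ** level
        # Clamp so that x == 1.0 falls into the last bin rather than past it.
        idx = min(int(x * n), n - 1)
        bins.append((level, idx, idx / n, (idx + 1) / n))
    return bins
```

Under these assumptions, `dyadic_bins(0.3, 2)` places $0.3$ in $[0,1]$ at level 0, in $[0,0.5]$ at level 1, and in $[0.25,0.5]$ at level 2, so coarser bins pool more rounds while finer bins round less aggressively.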
