On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
Emmeran Johnson, David Martínez-Rubio, Ciara Pike-Burke, Patrick Rebeschini
TL;DR
This work analyzes online convex optimization over ℓ_p-balls with p > 2, revealing a phase transition in the optimal regret between the low-dimensional (d ≤ T) and high-dimensional (d > T) regimes. It shows that FTRL with adaptive regularisation, which switches from a uniformly convex degree-p regulariser in early rounds to a strongly convex degree-2 regulariser around time t0 ≈ d, achieves anytime-optimal regret without knowledge of the horizon T. A core contribution is a negative result: among separable regularisers, adaptivity is necessary for anytime optimality, since any fixed separable regulariser is sub-optimal in one of the two dimension regimes. The paper also gives lower bounds for the linear bandit problem that rule out sub-linear regret in sufficiently high dimension for all ℓ_p-balls with p ≥ 1, highlighting the fundamental role of dimension and geometry in OCO, with practical implications for designing robust online learners.
Abstract
We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$), where the dimension $d$ exceeds the time horizon $T$, and the low-dimensional setting ($d \leq T$). We show that Follow-the-Regularised-Leader (FTRL) with time-varying regularisation that adapts to the dimension regime is anytime optimal in all dimension regimes. Motivated by this, we ask whether FTRL can achieve anytime optimality with a fixed, non-adaptive regulariser. Our main result establishes that, for separable regularisers, adaptivity in the regulariser is necessary: any fixed regulariser is sub-optimal in one of the two dimension regimes. Finally, we provide lower bounds that rule out sub-linear regret for the linear bandit problem in sufficiently high dimension for all $\ell_p$-balls with $p \geq 1$.
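
To make the adaptive scheme concrete, the FTRL update with a time-varying regulariser can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: $g_s$ denotes the loss (sub)gradient observed at round $s$, $\eta_t > 0$ is an unspecified scaling, and $\frac{1}{p}\|x\|_p^p$ and $\frac{1}{2}\|x\|_2^2$ are standard examples of a separable degree-$p$ uniformly convex regulariser and a separable degree-2 strongly convex regulariser, respectively. The regularisers and tuning analysed in the paper may differ.

$$
x_{t+1} \in \arg\min_{\|x\|_p \leq 1} \Big\{ \sum_{s=1}^{t} \langle g_s, x \rangle + \frac{1}{\eta_t} R_t(x) \Big\},
\qquad
R_t(x) =
\begin{cases}
\frac{1}{p}\|x\|_p^p, & t < t_0, \\
\frac{1}{2}\|x\|_2^2, & t \geq t_0,
\end{cases}
\qquad t_0 \approx d.
$$

Under this reading, the switch time depends only on the dimension $d$ and the current round $t$, which is why the learner does not need to know the horizon $T$ in advance.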
