Online (Non-)Convex Learning via Tempered Optimism
Maxime Haddouche, Olivier Wintenberger, Benjamin Guedj
TL;DR
The paper tackles online learning in dynamic environments with potentially non-convex losses by introducing tempered optimism, a framework designed to exploit imperfect experts. It develops two optimistically tempered (OT) algorithms for non-convex losses, OT-OGD and OT-OMD, which interpolate between an expert and the current iterate and achieve dimension-free dynamic regret bounds that scale with the expert path $P_T(\nu)$; these results rely on a $\xi$-approximate optimisation oracle that returns local minima. For convex losses, it builds explicit experts via Construct and combines them with Adjust in Dynamic Online Gradient Descent (D-OGD), yielding dynamic regret bounds of the form $GP_T(\nu) + c\sqrt{T}$ together with favourable robustness properties. Empirically, the approach matches or surpasses traditional online methods on real datasets and a toy quadratic problem, which suggests practical benefits in noisy or evolving environments. Overall, tempered optimism bridges optimistic online learning and the realities of imperfect, potentially non-convex decision problems.
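For readers unfamiliar with the quantities named above, the following block recalls the conventions commonly used in the dynamic-regret literature. These are assumptions made here for illustration: the norm, the indexing, and the symbols $x_t$, $\ell_t$ and $\mathrm{Reg}_T$ are not taken from the summary, and the paper's own definitions may differ.

```latex
% Assumed standard conventions, not quoted from the paper.
% x_t: learner's prediction at round t; \ell_t: loss revealed at round t;
% \nu = (\nu_1, \dots, \nu_T): the sequence of experts.
\[
  P_T(\nu) \;=\; \sum_{t=2}^{T} \lVert \nu_t - \nu_{t-1} \rVert ,
  \qquad
  \mathrm{Reg}_T(\nu) \;=\; \sum_{t=1}^{T} \ell_t(x_t) \;-\; \sum_{t=1}^{T} \ell_t(\nu_t).
\]
% A bound of the form G P_T(\nu) + c\sqrt{T} then reads:
\[
  \mathrm{Reg}_T(\nu) \;\le\; G\, P_T(\nu) + c\sqrt{T}.
\]
```

Under these conventions, a slowly moving expert sequence (small $P_T(\nu)$) leaves the $c\sqrt{T}$ term dominant, while a fast-moving one makes the path term dominant.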
Abstract
Optimistic Online Learning aims to exploit experts conveying reliable information to predict the future. However, such implicit optimism may be challenged when it comes to crafting such experts in practice. A fundamental example consists in approximating a minimiser of the current problem and using it as an expert. In dynamic environments, such an expert conveys only partially relevant information, as relying on it may lead to overfitting. To tackle this issue, we introduce the \emph{optimistically tempered} (OT) online learning framework, designed to handle such imperfect experts. As a first contribution, we show that tempered optimism is a fruitful paradigm for Online Non-Convex Learning by proposing simple yet powerful modifications of Online Gradient and Mirror Descent. Second, we derive another OT algorithm for convex losses. Third, we evaluate the practical efficiency of tempered optimism on real-life datasets and a toy experiment.
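To make the idea of tempering an imperfect expert concrete, here is a minimal sketch on a toy drifting quadratic. It is not the paper's OT-OGD: the update rule (mix the iterate with the expert, then take one gradient step), the names `tempered_ogd_step` and `run_toy`, the mixing weight `lam`, the step size `eta`, and the noisy-minimiser expert are all assumptions chosen for illustration.

```python
import numpy as np


def tempered_ogd_step(x, expert, grad_fn, lam=0.5, eta=0.1):
    """One hypothetical tempered step (an assumption, not the paper's rule).

    Mix the current iterate with the expert, then take a gradient step
    from the mixed point. lam=0 recovers plain online gradient descent;
    lam=1 trusts the expert fully before the gradient step.
    """
    mixed = (1.0 - lam) * x + lam * expert
    return mixed - eta * grad_fn(mixed)


def run_toy(T=200, dim=2, lam=0.5, eta=0.1, noise=0.05, seed=0):
    """Cumulative loss on a drifting quadratic 0.5 * ||x - centre_t||^2."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    centre = np.zeros(dim)
    total_loss = 0.0
    for _ in range(T):
        # The environment drifts before the learner's prediction is scored.
        centre = centre + 0.05 * np.ones(dim)
        total_loss += 0.5 * float(np.sum((x - centre) ** 2))
        # Imperfect expert: a noisy approximation of the current minimiser.
        expert = centre + noise * rng.standard_normal(dim)
        # Gradient of the current loss; bind centre to avoid late binding.
        grad_fn = lambda z, c=centre: z - c
        x = tempered_ogd_step(x, expert, grad_fn, lam=lam, eta=eta)
    return total_loss


if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        print(f"lam={lam:.1f}  cumulative loss={run_toy(lam=lam):.3f}")
```

Varying `lam` between 0 and 1 shows how much weight the learner places on the expert; which value works best depends on the drift and on how noisy the expert is, which is the trade-off the tempered framework is meant to address.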
