Online (Non-)Convex Learning via Tempered Optimism

Maxime Haddouche; Olivier Wintenberger; Benjamin Guedj

TL;DR

The paper tackles online learning in dynamic environments with potentially non-convex losses by introducing tempered optimism, formalised as the optimistically tempered (OT) online learning framework, which is designed to handle imperfect experts. It develops two non-convex OT algorithms, OT-OGD and OT-OMD, which interpolate between an expert and the current iterate and achieve dimension-free dynamic regret that scales with the expert path $P_T(\nu)$; these results rely on a $\xi$-approximate optimisation oracle to produce local minima. For convex losses, it builds explicit experts via Construct and combines them with Adjust in Dynamic Online Gradient Descent (D-OGD), yielding dynamic regret bounds of the form $GP_T(\nu) + c\sqrt{T}$ and favorable robustness properties. Empirically, the approach matches or surpasses traditional online methods on real datasets and a toy quadratic problem, demonstrating practical benefits in noisy or evolving environments. Overall, tempered optimism bridges optimistic online learning and the realities of imperfect, potentially non-convex, decision problems.
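
For orientation, the two quantities named above are conventionally defined as follows. This is the standard formulation, assumed here rather than taken from the paper, whose exact definitions (choice of norm, indexing, comparator sequence) may differ. The loss symbol $\ell_t$ and the label $\mathrm{Regret}^{\mathrm{dyn}}_T$ are notation introduced for illustration.

```latex
% Assumed (conventional) definitions, not quoted from the paper.
% \hat{\mathbf{x}}_t : point played at round t
% \nu_t              : expert / comparator at round t
\[
  \mathrm{Regret}^{\mathrm{dyn}}_T(\nu)
    = \sum_{t=1}^{T} \ell_t(\hat{\mathbf{x}}_t) - \sum_{t=1}^{T} \ell_t(\nu_t),
  \qquad
  P_T(\nu) = \sum_{t=1}^{T-1} \lVert \nu_{t+1} - \nu_t \rVert .
\]
```

Under this reading, a bound of the form $GP_T(\nu) + c\sqrt{T}$ is sublinear in $T$ whenever the path length $P_T(\nu)$ itself grows sublinearly, that is, whenever the comparator sequence drifts slowly enough.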

Abstract

Optimistic Online Learning aims to exploit experts conveying reliable information to predict the future. However, such implicit optimism may be challenged when it comes to the practical crafting of such experts. A fundamental example consists in approximating a minimiser of the current problem and using it as an expert. In the context of dynamic environments, such an expert only conveys partially relevant information as it may lead to overfitting. To tackle this issue, we introduce in this work the \emph{optimistically tempered} (OT) online learning framework designed to handle such imperfect experts. As a first contribution, we show that tempered optimism is a fruitful paradigm for Online Non-Convex Learning by proposing simple yet powerful modifications of Online Gradient and Mirror Descent. Second, we derive a second OT algorithm for convex losses and, third, we evaluate the practical efficiency of tempered optimism on real-life datasets and a toy experiment.

Paper Structure (39 sections, 25 theorems, 129 equations, 3 figures, 7 algorithms)

Introduction
Online Non-Convex Learning via Tempered Optimism
Framework.
Optimistically Tempered Online Gradient Descent.
A theoretical extension: Optimistically Tempered Online Mirror Descent.
Application: Tracking the best expert in nonconvex problems.
Tempered Optimism for Online Convex Learning
The Adjust algorithm
Dynamic Online Gradient Descent
Experiments
Experiments on real-life datasets
A toy experiment: the Online Quadratic Problem
Conclusion
Technical background
Azuma-Hoeffding's inequality
...and 24 more sections

Key Result

Theorem 2.2

Let $\left( \hat{\mathbf{x}}_{t} \right)_{t\geq 1}$ be the output of the OT-OGD algorithm with step $\eta>0$ and $\xi$-approximate optimisation oracle $\mathbf{O}$. Then, where the expectation is taken over the sequence $(u_1, \ldots, u_T)$ in the OT-OGD algorithm.
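
The summary describes OT-OGD only in words: it interpolates between an expert and the current iterate, the expert comes from an approximate optimisation oracle, and a random sequence $(u_1, \ldots, u_T)$ is involved. The sketch below is one plausible reading of that description, not the paper's pseudocode. The function names `approx_oracle` and `tempered_ogd`, the uniform draw of $u_t$ on $[0, 1)$, the oracle implemented as a few gradient steps, and the order of the interpolation and gradient steps are all assumptions made for illustration.

```python
import random


def approx_oracle(grad, x0, steps=50, lr=0.1):
    """Hypothetical stand-in for an approximate optimisation oracle.

    Runs a few plain gradient steps on the current loss, starting from x0,
    and returns the resulting approximate (local) minimiser.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def tempered_ogd(grads, x0, eta=0.1, rng=None):
    """Illustrative sketch of a tempered-optimism gradient scheme.

    grads: list of gradient functions, one per round (revealed after playing).
    x0:    initial point, as a list of floats.
    Returns the list of points played, one per round.
    """
    rng = rng or random.Random(0)
    x = list(x0)
    played = []
    for grad in grads:
        played.append(list(x))           # play the current iterate
        expert = approx_oracle(grad, x)  # expert: approx. minimiser of the current loss
        u = rng.random()                 # assumed tempering weight u_t in [0, 1)
        # Interpolate between the current iterate and the expert.
        mixed = [xi + u * (ei - xi) for xi, ei in zip(x, expert)]
        # Gradient step, using the gradient at the point that was played.
        g = grad(x)
        x = [mi - eta * gi for mi, gi in zip(mixed, g)]
    return played
```

A small value of `u` keeps the update close to plain online gradient descent, while a value near 1 trusts the expert almost entirely. Randomising `u` is what makes the iterates, and hence any regret statement about them, hold in expectation over $(u_1, \ldots, u_T)$.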

Figures (3)

Figure 1: Output of Adjust when: (left) performance is positive, (right) performance is negative.
Figure 2: Averaged cumulative losses for all datasets with error bars over 20 runs. The $x$-axis is the time.
Figure 3: Cumulative risks of D-OGD (purple, blue), OMGD (red), OGD (green).

Theorems & Definitions (52)

Definition 2.1
Theorem 2.2
proof
Theorem 2.3
Definition 3.1
Lemma 3.2
Theorem 3.3
Theorem 3.4
Proposition A.1
Definition A.2
...and 42 more
