A note on continuous-time online learning

Lexing Ying

TL;DR

This paper develops continuous-time formulations of three online learning problems: online linear optimization, adversarial bandits, and adversarial linear bandits. It uses Legendre transforms and Itô's lemma to derive concise proofs of optimal regret bounds for each problem, showing that continuous-time analysis can reproduce and sometimes strengthen discrete-time results. Notable findings include a continuous-time regret bound of $R \le \beta^{-1}\ln d$ for online linear optimization (which vanishes as $\beta\to\infty$) and bounds of $R=\sqrt{2Td\ln d}$ for adversarial bandits and $R=\sqrt{2Td\ln k}$ for adversarial linear bandits. Overall, the work offers a unified, concise framework for continuous-time online learning, with potential extensions to a broader class of problems.

Abstract

In online learning, data arrive sequentially, and the learner's goal is to make decisions online so as to minimize the overall regret. This note is concerned with continuous-time models and algorithms for several online learning problems: online linear optimization, adversarial bandits, and adversarial linear bandits. For each problem, we extend the discrete-time algorithm to the continuous-time setting and provide a concise proof of the optimal regret bound.

Paper Structure (6 sections, 3 theorems, 45 equations)

Introduction
Legendre transform
Online linear optimization
Adversarial bandit
Adversarial linear bandit
Discussions

Key Result

Theorem 1

For any $\beta>0$, the continuous-time regret is bounded by $\beta^{-1} \ln d$.
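
The bound in Theorem 1 can be checked numerically with a minimal sketch. The setup below is an assumption, not taken from the paper: the learner plays exponential weights $p_i(t) \propto \exp(-\beta L_i(t))$ on the simplex, where $L(t)$ is the cumulative loss, and the loss rates are hypothetical random values in $[0,1]$ held constant over each Euler step of size $dt$. The function name `simulate_regret` and all its parameters are illustrative. For such a discretization, the standard discrete-time exponential-weights analysis gives a regret of at most $\beta^{-1}\ln d + \beta T\,dt/8$, which approaches $\beta^{-1}\ln d$ as $dt \to 0$.

```python
import numpy as np


def simulate_regret(beta, d=5, T=1.0, n_steps=5000, seed=0):
    """Euler discretization of exponential weights driven by a loss rate.

    Returns the learner's cumulative loss minus that of the best fixed
    coordinate. All names and defaults here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Hypothetical loss rates in [0, 1], piecewise constant on each step.
    rates = rng.uniform(0.0, 1.0, size=(n_steps, d))

    cum_loss = np.zeros(d)   # L(t): cumulative loss of each coordinate
    learner_loss = 0.0
    for ell in rates:
        # Exponential weights: p_i proportional to exp(-beta * L_i).
        z = -beta * cum_loss
        z -= z.max()         # shift for numerical stability
        p = np.exp(z)
        p /= p.sum()
        learner_loss += float(p @ ell) * dt
        cum_loss += ell * dt
    return learner_loss - cum_loss.min()


if __name__ == "__main__":
    d, T, n_steps = 5, 1.0, 5000
    for beta in (0.5, 2.0, 10.0):
        regret = simulate_regret(beta, d=d, T=T, n_steps=n_steps)
        bound = np.log(d) / beta
        print(f"beta={beta:5.1f}  regret={regret:.5f}  ln(d)/beta={bound:.5f}")
```

Refining the step size (larger `n_steps`) shrinks the discretization term $\beta T\,dt/8$, so the simulated regret should stay below $\beta^{-1}\ln d$ up to that small correction.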

Theorems & Definitions (8)

Remark 1
Theorem 1
proof
Remark 2
Theorem 2
proof
Theorem 3
proof
