A second order regret bound for NormalHedge

Yoav Freund; Nicholas J. A. Harvey; Victor S. Portella; Yabing Qi; Yu-Xiang Wang

TL;DR

This work resolves a long-standing question on adaptive, second-order regret in the prediction-with-expert-advice setting: it shows that a NormalHedge variant achieves a quantile- and variance-based bound of the form $\mathrm{Regret}_{\varepsilon}(T) = O\left(\sqrt{(t_0+2V_T)\big(\log(t_0+2V_T)+2\log(1/\varepsilon)\big)}\right)$, where $V_T$ is the cumulative second moment of instantaneous regrets under a problem-dependent distribution. The authors develop a CP (constant-potential) Hedge framework built on good potentials that obey the backwards heat equation; this allows the discretization error to be controlled via local self-concordance and admits a continuous-time, SDE-inspired interpretation. The main contribution is a proof of an adaptive, second-order quantile regret bound for NormalHedge.BH, including a carefully chosen initialization $t_0$ and a lower bound that clarifies the limits of adaptivity. The results combine a continuous-time stochastic-calculus perspective with a rigorous discrete-time analysis, showing that algorithm-dependent variance measures can yield near-optimal, parameter-free regret bounds. For adaptive algorithms in online decision tasks, this means regret against the top $\varepsilon$-fraction of experts can be bounded in terms of the cumulative second moment $V_T$, without tuning to unknown properties of the sequence.
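To make the backwards heat equation condition concrete, the sketch below numerically checks it for two candidate potentials. Everything specific here is an assumption rather than something stated on this page: the normalization $\partial_t \phi + \tfrac{1}{2}\partial_{xx}\phi = 0$, the closed form $\phi(x,t) = t^{-1/2}\exp(x^2/(2t))$ for a normal-type potential, the form $\exp(\eta x - \eta^2 t/2)$ for an exponential-type potential, and all function names. The paper's own definitions may differ in scaling or domain.

```python
import math


def normal_potential(x: float, t: float) -> float:
    """ASSUMED normal-type potential: t^{-1/2} * exp(x^2 / (2 t)), for t > 0."""
    return math.exp(x * x / (2.0 * t)) / math.sqrt(t)


def exponential_potential(x: float, t: float, eta: float) -> float:
    """ASSUMED exponential-type potential: exp(eta * x - eta^2 * t / 2)."""
    return math.exp(eta * x - 0.5 * eta * eta * t)


def backwards_heat_residual(phi, x: float, t: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of d(phi)/dt + 0.5 * d^2(phi)/dx^2.

    A value near zero indicates phi satisfies the (assumed) backwards heat
    equation at the point (x, t), up to discretization error.
    """
    dphi_dt = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    d2phi_dx2 = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    return dphi_dt + 0.5 * d2phi_dx2


if __name__ == "__main__":
    for x, t in [(0.0, 1.0), (0.5, 2.0), (-1.0, 3.0)]:
        r_normal = backwards_heat_residual(normal_potential, x, t)
        r_exp = backwards_heat_residual(
            lambda a, b: exponential_potential(a, b, 0.3), x, t
        )
        print(f"x={x}, t={t}: normal residual={r_normal:.2e}, exp residual={r_exp:.2e}")
```

Both residuals should be close to zero, while a function such as $x^2$, which does not solve the equation, gives a residual of about 1. This is only a numerical sanity check, not a substitute for the analytic argument.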

Abstract

We consider the problem of prediction with expert advice for "easy" sequences. We show that a variant of NormalHedge enjoys a second-order $\varepsilon$-quantile regret bound of $O\big(\sqrt{V_T \log(V_T/\varepsilon)}\big)$ when $V_T > \log N$, where $V_T$ is the cumulative second moment of instantaneous per-expert regret averaged with respect to a natural distribution determined by the algorithm. The algorithm is motivated by a continuous time limit using Stochastic Differential Equations. The discrete time analysis uses self-concordance techniques.
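The two quantities in the abstract, $\varepsilon$-quantile regret and $V_T$, can be illustrated with a small sketch. It is not the paper's algorithm: it runs plain fixed-rate exponential weights, and it makes two labeled assumptions. First, the $\varepsilon$-quantile regret is taken to be the regret against the $\lceil \varepsilon N \rceil$-th best expert, which is one common convention. Second, the averaging distribution for $V_T$ is taken to be the play distribution $p_t$, whereas the paper uses a distribution determined by its own algorithm that may differ from $p_t$. The names `hedge_weights` and `run_and_measure` are hypothetical.

```python
import math


def hedge_weights(cum_losses, eta):
    """Exponential-weights distribution over experts with fixed rate eta."""
    m = min(cum_losses)  # shift for numerical stability
    w = [math.exp(-eta * (c - m)) for c in cum_losses]
    s = sum(w)
    return [wi / s for wi in w]


def run_and_measure(losses, eta, eps):
    """Return (eps-quantile regret, V_T) for a T x N list-of-lists loss matrix."""
    n = len(losses[0])
    cum_losses = [0.0] * n
    cum_regret = [0.0] * n
    v_t = 0.0
    for loss in losses:
        p = hedge_weights(cum_losses, eta)
        alg_loss = sum(pi * li for pi, li in zip(p, loss))
        # Instantaneous regret of the learner relative to each expert.
        inst = [alg_loss - li for li in loss]
        # ASSUMPTION: second moment averaged under p, not the paper's distribution.
        v_t += sum(pi * r * r for pi, r in zip(p, inst))
        for i in range(n):
            cum_losses[i] += loss[i]
            cum_regret[i] += inst[i]
    # ASSUMPTION: compare against the ceil(eps * N)-th best expert.
    k = max(1, math.ceil(eps * n))
    quantile_regret = sorted(cum_regret, reverse=True)[k - 1]
    return quantile_regret, v_t


if __name__ == "__main__":
    toy_losses = [[0.0, 1.0, 0.5, 0.5] for _ in range(20)]
    print(run_and_measure(toy_losses, eta=0.5, eps=0.25))
```

Under the assumption that the averaging distribution equals $p_t$, the instantaneous regrets have mean zero under $p_t$, so each term of $V_T$ is the variance of the round's losses under $p_t$. This is why $V_T$ stays small on "easy" sequences where the weights concentrate quickly.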

Paper Structure (109 sections, 43 theorems, 310 equations, 1 figure, 1 table)

This paper contains 109 sections, 43 theorems, 310 equations, 1 figure, 1 table.

Introduction
Related work.
Problem Setup
Learning from Expert Advice.
Hedge algorithms using potential functions
CP is well-defined for good potentials.
Efficient computation for $\Delta t$.
Meaning of $V_T$ as an output of the algorithm.
Two prominent instances of CP
Results
Adaptivity and the Open Problem.
Resolution & "Impossibility".
When is $\bm q$ different from $\bm p$?
A stochastic calculus perspective
The role of the Backwards Heat Equation.
...and 94 more sections

Key Result

Lemma 2

Assume $\phi$ and $\mathcal{D}$ satisfy the definition of a good potential. Let $t$ be the time variable that CP encounters at any iteration $j$. Then the corresponding quantile regret at that iteration can be bounded as follows.

Figures (1)

Figure 1: The Constant Potential Algorithm

Theorems & Definitions (81)

Definition 1
Lemma 2: Generic regret bound template
proof
Lemma 3: Exponential potential
proof
Lemma 4: Normal potential
proof
Theorem 5: Exponential Weights
Theorem 6: NormalHedge.BH
Theorem 7: Lower Bound
...and 71 more
