A second order regret bound for NormalHedge

Yoav Freund, Nicholas J. A. Harvey, Victor S. Portella, Yabing Qi, Yu-Xiang Wang

TL;DR

This work resolves a long-standing question on adaptive, second-order regret in the prediction-with-expert-advice setting. It shows that a NormalHedge variant achieves a quantile- and variance-based bound of the form $\mathrm{Regret}_{\varepsilon}(T) = O\left(\sqrt{(t_0+2V_T)\big(\log(t_0+2V_T)+2\log(1/\varepsilon)\big)}\right)$, where $V_T$ is the cumulative second moment of instantaneous regrets under a distribution determined by the algorithm. The authors develop a CP (constant-potential) Hedge framework built on good potentials that obey the backwards heat equation; this allows the discretization error to be controlled via local self-concordance and supports a continuous-time, SDE-inspired interpretation. The main contribution is a proof of an adaptive, second-order quantile regret bound for NormalHedge.BH, including a carefully chosen initialization $t_0$, together with a lower bound that clarifies the limits of adaptivity. The results combine a continuous-time stochastic-calculus perspective with a rigorous discrete-time analysis, showing that algorithm-dependent variance measures can yield near-optimal, parameter-free regret bounds. For adaptive algorithms in online decision tasks, this means that regret against the top $\varepsilon$ fraction of experts can be bounded in terms of the cumulative second moment $V_T$, without tuning to unknown properties of the sequence.
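
To get a feel for how the bound above scales, the expression inside the $O(\cdot)$ can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name `quantile_regret_bound`, the default `t0=1.0`, and the sample values of $V_T$ and $\varepsilon$ are all hypothetical, and the constant hidden by the $O(\cdot)$ notation is omitted.

```python
import math


def quantile_regret_bound(v_t: float, eps: float, t0: float = 1.0) -> float:
    """Evaluate sqrt((t0 + 2 V_T) * (log(t0 + 2 V_T) + 2 log(1/eps))).

    This is only the expression inside the O(.) of the stated bound; the
    hidden constant is not included. The default t0 is an illustrative
    assumption, not the initialization chosen in the paper.
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    if v_t < 0.0 or t0 <= 0.0:
        raise ValueError("v_t must be non-negative and t0 positive")
    s = t0 + 2.0 * v_t
    log_term = math.log(s) + 2.0 * math.log(1.0 / eps)
    # Clamp at zero so the square root stays defined when s < 1 (sketch-only guard).
    return math.sqrt(s * max(log_term, 0.0))


if __name__ == "__main__":
    # Hypothetical values: the bound grows with V_T and with 1/eps.
    for v_t in (10.0, 100.0, 1000.0):
        for eps in (0.1, 0.01):
            print(f"V_T={v_t:7.1f}  eps={eps:4.2f}  bound={quantile_regret_bound(v_t, eps):8.2f}")
```

The point of the sketch is that the expression depends on the sequence only through $V_T$, so when $V_T$ is much smaller than $T$ the bound is correspondingly smaller than a worst-case $\sqrt{T\log(1/\varepsilon)}$ rate.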

Abstract

We consider the problem of prediction with expert advice for "easy" sequences. We show that a variant of NormalHedge enjoys a second-order $\varepsilon$-quantile regret bound of $O\big(\sqrt{V_T \log(V_T/\varepsilon)}\big)$ when $V_T > \log N$, where $V_T$ is the cumulative second moment of instantaneous per-expert regret averaged with respect to a natural distribution determined by the algorithm. The algorithm is motivated by a continuous-time limit using Stochastic Differential Equations. The discrete-time analysis uses self-concordance techniques.

Paper Structure (109 sections, 43 theorems, 310 equations, 1 figure, 1 table)


Key Result

Lemma 2

Assume the pair $(\phi,\mathcal{D})$ satisfies the definition of a good potential. Let $t$ be the value of the time variable that CP encounters at any iteration $j$. Then the corresponding quantile regret at that iteration can be bounded as follows.
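
The TL;DR notes that good potentials obey the backwards heat equation. As an illustration only, the sketch below checks this numerically for one candidate potential. Two things here are assumptions rather than statements from the paper: the form $\phi(x,t) = e^{x^2/(2t)}/\sqrt{t}$ taken for the normal potential, and the normalization $\partial_t \phi + \tfrac{1}{2}\partial_{xx}\phi = 0$ taken for the equation. The helper names are hypothetical.

```python
import math


def normal_potential(x: float, t: float) -> float:
    """Assumed normal potential phi(x, t) = exp(x^2 / (2 t)) / sqrt(t), for t > 0."""
    return math.exp(x * x / (2.0 * t)) / math.sqrt(t)


def backwards_heat_residual(phi, x: float, t: float, h: float = 1e-3) -> float:
    """Central finite-difference estimate of d/dt phi + (1/2) d^2/dx^2 phi at (x, t)."""
    d_t = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    d_xx = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    return d_t + 0.5 * d_xx


if __name__ == "__main__":
    # Hypothetical sample points; the residual should be near zero up to
    # finite-difference error if phi solves the assumed equation.
    for x, t in [(0.0, 1.0), (1.5, 4.0), (-3.0, 10.0)]:
        residual = backwards_heat_residual(normal_potential, x, t)
        print(f"x={x:5.1f}  t={t:5.1f}  residual={residual:.2e}")
```

The same residual function can be applied to any other candidate potential passed as `phi`, which makes it a quick sanity check before working through an analytic verification.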

Figures (1)

  • Figure 1: The Constant Potential Algorithm

Theorems & Definitions (81)

  • Definition 1
  • Lemma 2: Generic regret bound template
  • proof
  • Lemma 3: Exponential potential
  • proof
  • Lemma 4: Normal potential
  • proof
  • Theorem 5: Exponential Weights
  • Theorem 6: NormalHedge.BH
  • Theorem 7: Lower Bound
  • ...and 71 more