An Asymptotic Law of the Iterated Logarithm for $\mathrm{KL}_{\inf}$

Ashwin Ram, Aaditya Ramdas

TL;DR

This work identifies the exact almost-sure fluctuation scale of the empirical mean-constrained KL projection $\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)$. By deriving a tight upper bound via an affine information tilt and a matching lower bound through the Donsker–Varadhan variational framework, the authors prove a sharp Law of the Iterated Logarithm: $\limsup_{t\to\infty}\frac{t\,\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)}{\log\log t}=1$ almost surely, with the local quadratic relation $\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)=(1+o(1))\dfrac{(\mu-\widehat{\mu}_t)_+^2}{2\widehat{\sigma}_t^2}$. They extend the result beyond bounded support by introducing time-varying envelopes $[-B_t,B_t]$ with $B_t=o\left(\sqrt{t/\log\log t}\right)$, obtaining the same LIL constant under broad tail regimes (sub-Gaussian, sub-exponential, finite $p>2$ moments) and showing sharpness of the envelope at the $\sqrt{t/\log\log t}$ scale. The findings have direct implications for KL-based bandit indices and sequential testing, clarifying the precise fluctuation scale of information projections in online learning contexts.
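The local quadratic relation above can be checked numerically. The sketch below is illustrative only and is not the authors' code. It assumes observations bounded above by a known constant `upper`, and it assumes the standard dual form for that bounded case, $\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)=\max_{0\le\lambda\le 1/(\mathrm{upper}-\mu)}\frac{1}{t}\sum_i\log\bigl(1-\lambda(X_i-\mu)\bigr)$, with the constraint that the projected mean is at least $\mu$. The function names `kl_inf_dual` and `quadratic_approx` are hypothetical.

```python
import numpy as np


def kl_inf_dual(x, mu, upper, iters=200):
    """Empirical KL_inf via the (assumed) bounded-support dual.

    Maximizes mean(log(1 - lam * (x - mu))) over lam in [0, 1/(upper - mu)]
    with a ternary search, which is valid because the objective is concave
    in lam.
    """
    x = np.asarray(x, dtype=float)
    if x.mean() >= mu:
        # The empirical distribution already satisfies the mean constraint.
        return 0.0

    def objective(lam):
        return np.mean(np.log1p(-lam * (x - mu)))

    # Stay strictly inside the feasible interval to avoid log(0).
    lo, hi = 0.0, (1.0 / (upper - mu)) * (1.0 - 1e-12)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return max(objective(0.5 * (lo + hi)), 0.0)


def quadratic_approx(x, mu):
    """Local quadratic proxy (mu - mean)_+^2 / (2 * variance)."""
    x = np.asarray(x, dtype=float)
    gap = max(mu - x.mean(), 0.0)
    return gap**2 / (2.0 * x.var())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.uniform(0.0, 1.0, size=5000)  # bounded in [0, 1)
    mu = 0.52  # hypothetical target mean, slightly above the true mean 0.5
    print("dual     :", kl_inf_dual(sample, mu, upper=1.0))
    print("quadratic:", quadratic_approx(sample, mu))
```

When the gap $\mu-\widehat{\mu}_t$ is small relative to $\widehat{\sigma}_t$, the two printed values should agree closely; when $\widehat{\mu}_t\ge\mu$, both are zero.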

Abstract

The population $\mathrm{KL}_{\inf}$ is a fundamental quantity that appears in lower bounds for (asymptotically) optimal regret of pure-exploration stochastic bandit algorithms, and optimal stopping time of sequential tests. Motivated by this, an empirical $\mathrm{KL}_{\inf}$ statistic is frequently used in the design of (asymptotically) optimal bandit algorithms and sequential tests. While nonasymptotic concentration bounds for the empirical $\mathrm{KL}_{\inf}$ have been developed, their optimality in terms of constants and rates is questionable, and their generality is limited (usually to bounded observations). The fundamental limits of nonasymptotic concentration are often described by the asymptotic fluctuations of the statistics. With that motivation, this paper presents a tight (upper and lower) law of the iterated logarithm for empirical $\mathrm{KL}_{\inf}$ applying to extremely general (unbounded) data.

Paper Structure

This paper contains 9 sections, 14 theorems, and 108 equations.

Key Result

theorem 1

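Under our above setup, we have that $\limsup_{t\to\infty}\frac{t\,\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)}{\log\log t}\le 1$ almost surely. Equivalently, for every $\varepsilon>0$, almost surely there exists a (possibly random) $T_\varepsilon<\infty$ such that for all $t\ge T_\varepsilon$, $\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)\le(1+\varepsilon)\dfrac{\log\log t}{t}$.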
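
As a heuristic reading of the $\log\log t/t$ scale (a sketch, not the paper's proof), one can combine the local quadratic relation quoted in the TL;DR with the classical Hartman–Wintner law of the iterated logarithm for the sample mean. The assumptions here are ours: i.i.d. data whose true mean is $\mu$, variance $\sigma^2\in(0,\infty)$, and $\widehat{\sigma}_t^2\to\sigma^2$ almost surely. Then

$$\limsup_{t\to\infty}\frac{\mu-\widehat{\mu}_t}{\sqrt{2\sigma^2\log\log t/t}}=1\ \text{a.s.}\quad\Longrightarrow\quad\limsup_{t\to\infty}\frac{t}{\log\log t}\cdot\frac{(\mu-\widehat{\mu}_t)_+^2}{2\widehat{\sigma}_t^2}=1\ \text{a.s.},$$

since squaring the positive part preserves the limsup and $\sigma^2/\widehat{\sigma}_t^2\to 1$. Replacing the quadratic proxy by $(1+o(1))^{-1}\,\mathrm{KL}_{\inf}(\widehat{P}_t,\mu)$ then suggests the constant $1$ in the statement.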

Theorems & Definitions (28)

  • theorem 1
  • lemma 1
  • proof : Theorem \ref{thm:LIL-KLinf}
  • theorem 2
  • lemma 2
  • proof : Theorem \ref{thm:LIL-KLinf-equality}
  • proposition 1
  • theorem 3
  • proof : Lemma \ref{lem:Taylor}
  • proof : Lemma \ref{lem:DV-lower}
  • ...and 18 more