Sharp analysis of linear ensemble sampling
Arya Akhavan, David Janz, Csaba Szepesvári
TL;DR
This work provides a sharp analysis of linear ensemble sampling (ES) with Gaussian perturbations in stochastic linear bandits. By representing the Gaussian perturbations as diagonal Gaussian martingale transforms and embedding them into independent Brownian motions with clocked time changes via the Dambis–Dubins–Schwarz (DDS) theorem, the authors reduce a complicated adaptive exploration problem to a time-uniform exceedance problem for Brownian motions. They prove that with ensemble size $m=\Theta(d\log n)$, ES achieves high-probability regret of order $\tilde{O}(d^{3/2}\sqrt{n})$, closing the gap to Thompson sampling while keeping computational cost comparable. A key technical contribution is a time-uniform lower bound on exceedance frequencies for $m$ Brownian motions, which, together with a master regret bound, yields the main regret guarantee. The work also develops a suite of continuous-time tools (DDS embedding, Ornstein–Uhlenbeck time changes) and provides a near-tight lower bound on the ensemble size, showing that $m$ must grow with $d$ and that ensembles that are too small fail in general. The approach opens avenues for applying continuous-time embeddings to discrete-time learning analyses and suggests potential extensions to non-Gaussian perturbations and nonlinear models.
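For orientation, the embedding step mentioned above can be sketched in its simplest scalar form. The notation here ($a_s$, $Z_s$, $M_t$, $B$) is illustrative and not taken from the paper: let $Z_1, Z_2, \dots$ be i.i.d. standard Gaussians and let each weight $a_s$ be determined by the history before round $s$. Then, possibly after enlarging the probability space, there is a standard Brownian motion $B$ such that

$$
M_t \;=\; \sum_{s=1}^{t} a_s Z_s \;=\; B\!\left(\sum_{s=1}^{t} a_s^2\right) \quad \text{for all } t \ge 0 .
$$

The adaptively weighted Gaussian sum is thus a Brownian motion read off at the random clock $\sum_{s \le t} a_s^2$, which is why statements about the perturbations can be phrased as time-uniform statements about Brownian paths.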
Abstract
We analyse linear ensemble sampling (ES) with standard Gaussian perturbations in stochastic linear bandits. We show that for ensemble size $m=\Theta(d\log n)$, ES attains $\tilde O(d^{3/2}\sqrt n)$ high-probability regret, closing the gap to the Thompson sampling benchmark while keeping computation comparable. The proof brings a new perspective on randomized exploration in linear bandits by reducing the analysis to a time-uniform exceedance problem for $m$ independent Brownian motions. Intriguingly, this continuous-time lens is not forced; it appears natural, and perhaps necessary: the discrete-time problem seems to be asking for a continuous-time solution, and we know of no other way to obtain a sharp ES bound.
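To make the object of study concrete, here is a minimal Python sketch of a linear ensemble sampler with standard Gaussian perturbations. It is a simplified illustration under stated assumptions, not the exact algorithm analysed in the paper: the finite arm set, the ridge parameter `lam`, the unit perturbation scale, and the function name `linear_ensemble_sampling` are all choices made here for illustration. Each of the `m` ensemble members keeps its own perturbed ridge-regression estimate; in every round one member is drawn uniformly at random and its greedy arm is played.

```python
import numpy as np


def linear_ensemble_sampling(arms, theta_star, n, m, lam=1.0, noise_sd=1.0, seed=0):
    """Run a simplified linear ensemble sampler and return cumulative regret per round.

    arms:       (k, d) array of candidate actions (assumed finite for this sketch)
    theta_star: (d,) true parameter, used only to simulate rewards and compute regret
    n, m:       horizon and ensemble size
    """
    rng = np.random.default_rng(seed)
    _, d = arms.shape
    V = lam * np.eye(d)  # regularised design matrix, shared by all members
    # b[j] holds sum_s x_s * (y_s + z_s^j) plus a Gaussian prior perturbation for member j
    b = np.sqrt(lam) * rng.standard_normal((m, d))
    best = np.max(arms @ theta_star)
    regret = np.zeros(n)
    total = 0.0
    for t in range(n):
        j = rng.integers(m)                  # choose an ensemble member uniformly at random
        theta_j = np.linalg.solve(V, b[j])   # that member's perturbed ridge estimate
        x = arms[np.argmax(arms @ theta_j)]  # act greedily with respect to it
        y = x @ theta_star + noise_sd * rng.standard_normal()  # simulated noisy reward
        z = rng.standard_normal(m)           # fresh standard Gaussian perturbation per member
        V += np.outer(x, x)
        b += (y + z)[:, None] * x[None, :]   # every member sees its own perturbed reward
        total += best - x @ theta_star
        regret[t] = total
    return regret
```

A call such as `linear_ensemble_sampling(np.eye(3), np.array([1.0, 0.0, 0.0]), n=1000, m=16)` returns a nondecreasing array of cumulative regret. The per-round cost is one linear solve plus an update of all `m` members, so the ensemble size directly drives the computation, which is why the scaling $m=\Theta(d\log n)$ matters in practice.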
