Uniform mean estimation via generic chaining

Daniel Bartl; Shahar Mendelson

TL;DR

The paper addresses uniform estimation of the means $\mathbb{E}\,u(f(X))$ over a rich class $F$ of potentially heavy-tailed functions. It introduces a uniform mean estimator $\Psi$ that combines an optimal univariate mean estimator with Talagrand's generic chaining and attains, under minimal assumptions, the subgaussian-type error bound $\sup_{f\in F} |\Psi(X_1,\dots,X_N,f) - \mathbb{E}\,u(f(X))| \lesssim R(F) \frac{\mathbb{E} \sup_{f\in F} |G_f|}{\sqrt{N}}$. Key contributions include the precise assumptions and definitions (the isomorphic distance $\rho$, $R(F)$, $d_F$, and $D^*(F)$), a complete proof framework, and two applications: a geometric setting for isotropic log-concave measures and robust covariance estimation under adversarial corruption. The results give optimal uniform mean estimation beyond light-tailed settings and extend to corrupted data, with explicit bounds and consequences for high-dimensional probability and statistics.

Abstract

We introduce an empirical functional $\Psi$ that is an optimal uniform mean estimator: let $F\subset L_2(\mu)$ be a class of mean-zero functions, let $u$ be a real-valued function, and let $X_1,\dots,X_N$ be independent and distributed according to $\mu$. We show that under minimal assumptions, with exponentially high $\mu^{\otimes N}$-probability, \[ \sup_{f\in F} |\Psi(X_1,\dots,X_N,f) - \mathbb{E} u(f(X))| \leq c R(F) \frac{ \mathbb{E} \sup_{f\in F } |G_f| }{\sqrt N}, \] where $(G_f)_{f\in F}$ is the Gaussian process indexed by $F$ and $R(F)$ is an appropriate notion of `diameter' of the class $\{u(f(X)) : f\in F\}$. The fact that such a bound is possible is surprising, and it leads to the solution of various key problems in high-dimensional probability and high-dimensional statistics. The construction is based on combining Talagrand's generic chaining mechanism with optimal mean estimation procedures for a single real-valued random variable.
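The abstract does not say which univariate procedure is used as the building block. As a rough illustration only, the sketch below implements median-of-means, a standard estimator that achieves subgaussian-type deviations for a single real-valued random variable with finite variance. It is a stand-in chosen here, not necessarily the authors' procedure, and it covers only the single-variable ingredient, not the chaining construction of $\Psi$. The function name `median_of_means`, the block count, and the symmetrised Pareto test distribution are all assumptions made for the example.

```python
import random
import statistics


def median_of_means(sample, n_blocks):
    """Median-of-means estimate of the mean of a real-valued sample.

    Splits the sample into n_blocks disjoint blocks of equal size
    (dropping any remainder), averages each block, and returns the
    median of the block averages.
    """
    if n_blocks < 1 or n_blocks > len(sample):
        raise ValueError("n_blocks must be between 1 and len(sample)")
    block_size = len(sample) // n_blocks
    block_means = [
        statistics.fmean(sample[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]
    return statistics.median(block_means)


if __name__ == "__main__":
    rng = random.Random(0)
    # Heavy-tailed, mean-zero sample: a symmetrised Pareto variable with
    # shape 2.1 (a hypothetical choice; it has finite variance only barely).
    data = [rng.choice((-1.0, 1.0)) * rng.paretovariate(2.1) for _ in range(10_000)]
    print("empirical mean :", statistics.fmean(data))
    print("median of means:", median_of_means(data, n_blocks=20))
```

The number of blocks trades confidence against accuracy: more blocks tolerate more outlying block averages, while fewer blocks make each block average more precise. A single extreme observation can move the empirical mean arbitrarily far, but it affects at most one block average and therefore leaves the median essentially unchanged.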
