Beyond Catoni: Sharper Rates for Heavy-Tailed and Robust Mean Estimation

Shivam Gupta; Samuel B. Hopkins; Eric Price

TL;DR

The paper investigates sharp constants in high-dimensional mean estimation for heavy-tailed distributions and under adversarial corruption, assuming a covariance bound $\Sigma \preceq \sigma^2 I_d$. It shows that the conventional lifting-based constant $\mathrm{JUNG}_d = \sqrt{2d/(d+1)}$ can be strictly improved in the $\delta$-dominated regime via a new heavy-tailed estimator: a two-dimensional construction achieves a $(1-\tau)$-fraction of the 2D Jung bound, and a generalized Jung bound extends the improvement to all dimensions. In the robust setting, it proves a population-limit lower bound showing that the lifting approach, including its $\sqrt{2d/(d+1)}$ factor, is optimal in the infinite-sample limit, matching the folklore+Jung upper bound. Together, the results separate the achievable constants in heavy-tailed and robust mean estimation, and they leave open questions about computational efficiency and further tightening of constants across regimes.

Abstract

We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $Σ\preccurlyeq σ^2 I_d$ given $n$ samples. When $d = 1$, \cite{catoni} showed an estimator with error $(1+o(1)) \cdot σ\sqrt{\frac{2 \log \frac{1}δ}{n}}$, with probability $1 - δ$, matching the Gaussian error rate. For $d>1$, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a $1-δ$ confidence radius of $\sqrt{\frac{2 d}{d+1}} \cdot σ\left(\sqrt{\frac{d}{n}} + \sqrt{\frac{2 \log \frac{1}δ}{n}}\right)$, incurring a $\sqrt{\frac{2d}{d+1}}$-factor loss over the Gaussian rate. When the $\sqrt{\frac{d}{n}}$ term dominates by a $\sqrt{\log \frac{1}δ}$ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: Is the $\sqrt{\frac{2 d}{d+1}}$ loss \emph{necessary} when the $\sqrt{\frac{2 \log \frac{1}δ}{n}}$ term dominates? We show that the answer is \emph{no} -- we construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an $ε$-fraction of samples arbitrarily: in this case, we show that the above strategy of combining one-dimensional estimates and incurring the $\sqrt{\frac{2d}{d+1}}$-factor \emph{is} optimal in the infinite-sample limit.
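
To make the quantities in the abstract concrete, the following minimal Python sketch evaluates the $\sqrt{\frac{2d}{d+1}}$ factor and the two confidence radii it relates. The function names (`jung_factor`, `gaussian_rate`, `naive_radius`) are illustrative choices and do not come from the paper; the sketch only evaluates the formulas stated above and does not implement any estimator.

```python
import math


def jung_factor(d: int) -> float:
    """The factor sqrt(2d/(d+1)) that the abstract attributes to the naive estimator."""
    return math.sqrt(2 * d / (d + 1))


def gaussian_rate(d: int, n: int, delta: float, sigma: float = 1.0) -> float:
    """sigma * (sqrt(d/n) + sqrt(2 log(1/delta) / n)), the Gaussian-type rate."""
    return sigma * (math.sqrt(d / n) + math.sqrt(2 * math.log(1 / delta) / n))


def naive_radius(d: int, n: int, delta: float, sigma: float = 1.0) -> float:
    """Confidence radius of the naive estimator: Jung factor times the Gaussian-type rate."""
    return jung_factor(d) * gaussian_rate(d, n, delta, sigma)


if __name__ == "__main__":
    # The factor equals 1 at d = 1 and increases toward sqrt(2) as d grows.
    for d in (1, 2, 10, 1000):
        print(d, jung_factor(d), naive_radius(d, n=10_000, delta=1e-6))
```

The loss is therefore at most a factor of $\sqrt{2}$ in any dimension, which is why the question in the abstract concerns the constant rather than the rate.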

Paper Structure (49 sections, 30 theorems, 108 equations, 2 figures, 12 algorithms)

Introduction
High-dimensional mean estimation.
Our contributions: heavy-tailed estimation.
Our contributions: robust estimation.
Summary.
Related Work
Heavy-tailed and Robust Estimation.
Towards optimal constants.
Proof Overview
Heavy-Tailed Estimator
High-level goal.
Variant of Catoni's estimator for $d=1$.
A better constant for "inlier-light" distributions.
An alternative to Catoni for outlier-light distributions.
Handling $d=2$.
...and 34 more sections

Key Result

Theorem 1.1

There exist constants $\tau, C > 0$ such that the following holds. Let $d \geq 2$, and suppose $n \geq C \log \frac{1}{\delta} \geq C^2 d$. There is an algorithm that takes $n$ samples from a distribution over $\mathbb{R}^d$ with covariance $\Sigma \preceq \sigma^2 I$, as well as $\sigma^2$ and $\delta$ [...] with $1-\delta$ probability.

Figures (2)

Figure 1: Some $\psi$ functions satisfying Catoni's constraints \ref{eq:catonirequirement}
Figure 2: For $d = 2$, the algorithm sees as input the distribution on the left after the adversary corrupts $\varepsilon$-mass. The three distributions to its right are ones consistent with the input.
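
As a rough illustration of the kind of $\psi$ function Figure 1 refers to, the sketch below assumes that the referenced requirement is the standard one from Catoni's one-dimensional analysis, $-\log(1 - x + x^2/2) \le \psi(x) \le \log(1 + x + x^2/2)$; the summary does not reproduce the paper's own equation, so this is an assumption. The M-estimator uses a simplified scale $\alpha = \sqrt{2 \log(1/\delta) / (n \sigma^2)}$, also an assumption made for brevity. It is the classical one-dimensional construction, not the paper's variant or its new estimator, and the names `psi`, `satisfies_catoni_bounds`, and `catoni_mean_1d` are hypothetical.

```python
import math


def psi(x: float) -> float:
    """One influence function meeting the assumed bounds: sign(x) * log(1 + |x| + x^2/2)."""
    return math.copysign(math.log1p(abs(x) + 0.5 * x * x), x)


def satisfies_catoni_bounds(f, grid, tol: float = 1e-12) -> bool:
    """Check -log(1 - x + x^2/2) <= f(x) <= log(1 + x + x^2/2) on a finite grid."""
    for x in grid:
        lower = -math.log(1 - x + 0.5 * x * x)
        upper = math.log(1 + x + 0.5 * x * x)
        if not (lower - tol <= f(x) <= upper + tol):
            return False
    return True


def catoni_mean_1d(xs, sigma: float, delta: float, iters: int = 200) -> float:
    """Solve sum_i psi(alpha * (x_i - theta)) = 0 for theta by bisection.

    alpha is a simplified scale choice (an assumption of this sketch).
    """
    n = len(xs)
    alpha = math.sqrt(2 * math.log(1 / delta) / (n * sigma ** 2))
    lo, hi = min(xs), max(xs)  # the score is >= 0 at lo and <= 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        score = sum(psi(alpha * (x - mid)) for x in xs)
        if score > 0:  # score is non-increasing in theta, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    grid = [k / 100 for k in range(-1000, 1001)]
    print(satisfies_catoni_bounds(psi, grid))
    # One extreme sample among 100 moves the estimate far less than it moves the sample mean.
    data = [0.0] * 99 + [1000.0]
    print(sum(data) / len(data), catoni_mean_1d(data, sigma=1.0, delta=0.05))
```

Because $\psi$ grows only logarithmically, a single extreme sample has bounded influence on the root, which is the mechanism behind the one-dimensional guarantee quoted in the abstract.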

Theorems & Definitions (55)

Theorem 1.1
Theorem 1.2
Theorem 1.3: Folklore + Jung's theorem
Definition 3.0: $(\beta, L)$-Inlier-Light Distribution
Definition 3.0: $(\beta, L)$-Outlier-Light Distribution
Lemma 3.0: Improved Rate for One-Dimensional Inlier-Light Distributions
Lemma 3.0: Two-dimensional Inlier-Light vs. Outlier-Light Test
Lemma 3.0: Two-Dimensional Estimator for Inlier-Light Distributions
Lemma 3.0: Two-Dimensional Estimator for Outlier-Light Distributions
Theorem 3.1: Final Two-Dimensional Estimator
...and 45 more
