High-dimensional Gaussian and bootstrap approximations for robust means

Anders Bredahl Kock, David Preinerstorfer

TL;DR

The paper addresses reliable Gaussian and bootstrap approximations for high-dimensional robust means when the dimension $d$ grows rapidly with the sample size $n$. It introduces data-driven Winsorized and trimmed means that adapt to the (unknown) number of finite moments $m>2$, remain robust to adversarial contamination, and admit Gaussian and bootstrap inference via a novel robust covariance estimator. The key contributions include a rigorous high-dimensional Gaussian approximation for Winsorized means that allows exponential growth in $d$, a PSD-corrected covariance estimator with contamination-robust guarantees, and bootstrap consistency including normalized and trimmed variants. Together, these results provide practical, adaptive inference tools for high-dimensional mean estimation under heavy tails and contamination with broad applicability to multivariate settings and complex hypothesis testing.

Abstract

Recent years have witnessed much progress on Gaussian and bootstrap approximations to the distribution of sums of independent random vectors with dimension $d$ large relative to the sample size $n$. However, for any number of moments $m>2$ that the summands may possess, there exist distributions such that these approximations break down if $d$ grows faster than the polynomial barrier $n^{\frac{m}{2}-1}$. In this paper, we establish Gaussian and bootstrap approximations to the distributions of winsorized and trimmed means that allow $d$ to grow at an exponential rate in $n$ as long as $m>2$ moments exist. The approximations remain valid under some amount of adversarial contamination. Our implementations of the winsorized and trimmed means do not require knowledge of $m$. As a consequence, the performance of the approximation guarantees "adapts" to $m$.
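For readers unfamiliar with the two estimators named in the abstract, the following is a minimal sketch of textbook coordinate-wise winsorized and trimmed means. It is not the paper's data-driven construction: the function names and the hand-picked tail fraction `frac` are assumptions made for illustration, whereas the paper chooses the amount of winsorization or trimming without knowledge of $m$.

```python
import numpy as np


def _tail_count(n, frac):
    # Number of observations treated as extreme in each tail.
    # `frac` is a hypothetical, user-chosen tuning parameter.
    if not 0.0 <= frac < 0.5:
        raise ValueError("frac must lie in [0, 0.5)")
    return int(np.floor(frac * n))


def winsorized_mean(x, frac):
    """Coordinate-wise winsorized mean of an (n, d) array.

    In every coordinate, the k smallest values are raised to the
    (k+1)-th smallest value and the k largest are lowered to the
    (k+1)-th largest, with k = floor(frac * n); then the mean is taken.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    k = _tail_count(n, frac)
    xs = np.sort(x, axis=0)
    lower, upper = xs[k], xs[n - k - 1]  # per-coordinate clipping levels
    return np.clip(x, lower, upper).mean(axis=0)


def trimmed_mean(x, frac):
    """Coordinate-wise trimmed mean of an (n, d) array.

    In every coordinate, the k smallest and k largest values are
    discarded, with k = floor(frac * n); the rest are averaged.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    k = _tail_count(n, frac)
    xs = np.sort(x, axis=0)
    return xs[k:n - k].mean(axis=0)
```

On the single-coordinate sample 1, 2, 3, 4, 100 with `frac = 0.2`, both functions return 3.0, while the ordinary sample mean is 22.0. This illustrates why such estimators are less sensitive to a single outlying or contaminated observation.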

Paper Structure

This paper contains 16 sections, 20 theorems, 184 equations.

Key Result

Theorem 2.1

Fix $c\in(1,\sqrt{1.5})$, and let Assumption ass:setting be satisfied with $m>2$. If $\varepsilon_n\in(0,1/2)$, with $\varepsilon_n$ as in eq:epsfam, then where $C$ is a constant depending only on $b_1,b_2,c$ and $m$. In particular, $\rho_{n,W}\to 0$ if $\sqrt{n\log(d)}\overline{\eta}_n^{1-\frac{1}{m}}\to 0$ and $\log(d)/n^{\frac{m-2}{5m-2}}\to 0$.
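As a worked illustration of the dimension condition $\log(d)/n^{\frac{m-2}{5m-2}}\to 0$ stated in the theorem, the exponent can be evaluated for a few values of $m$. This is plain arithmetic on the stated exponent and says nothing about the contamination condition involving $\overline{\eta}_n$, which must hold separately.

$$
\frac{m-2}{5m-2}\bigg|_{m=3}=\frac{1}{13},\qquad
\frac{m-2}{5m-2}\bigg|_{m=4}=\frac{2}{18}=\frac{1}{9},\qquad
\lim_{m\to\infty}\frac{m-2}{5m-2}=\frac{1}{5}.
$$

The exponent is increasing in $m$, since $\frac{d}{dm}\frac{m-2}{5m-2}=\frac{8}{(5m-2)^2}>0$. With $m=4$, for example, the condition reads $\log(d)=o(n^{1/9})$, so $d$ may grow like $\exp(n^{a})$ for any $a<1/9$. By comparison, the polynomial barrier $n^{\frac{m}{2}-1}$ from the abstract equals $n$ when $m=4$.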

Theorems & Definitions (41)

  • Remark 2.1
  • Theorem 2.1
  • Theorem 3.1
  • Theorem 3.2
  • Remark 3.1: $m\geq 4$
  • Theorem 4.1
  • Theorem 4.2
  • Theorem 5.1
  • Theorem 5.2
  • Lemma B.1
  • ...and 31 more