Improved covariance estimation: optimal robustness and sub-Gaussian guarantees under heavy tails

Roberto I. Oliveira, Zoraida F. Rico

TL;DR

The covariance matrix $\Sigma$ of a random $d$-dimensional vector can be estimated from an i.i.d. sample of size $n$ with the same high-probability error rates that the sample covariance matrix achieves in the case of Gaussian data.

Abstract

We present an estimator of the covariance matrix $\Sigma$ of a random $d$-dimensional vector from an i.i.d. sample of size $n$. Our sole assumption is that this vector satisfies a bounded $L^p-L^2$ moment assumption over its one-dimensional marginals, for some $p\geq 4$. Given this, we show that $\Sigma$ can be estimated from the sample with the same high-probability error rates that the sample covariance matrix achieves in the case of Gaussian data. This holds even though we allow for very general distributions that may not have moments of order $>p$. Moreover, our estimator can be made optimally robust to adversarial contamination. This result improves the recent contributions by Mendelson and Zhivotovskiy and by Catoni and Giulini, and matches parallel work by Abdalla and Zhivotovskiy (the exact relationship with this last work is described in the paper).
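
For orientation, a hypothesis of this kind is commonly written as a norm equivalence over one-dimensional marginals, and the Gaussian benchmark as an operator-norm bound in terms of the effective rank. The display below is a sketch of those standard forms, not the paper's exact statement. The symbols $\mu$, $\kappa_p$, $\mathrm{r}(\Sigma)$, $\widehat{\Sigma}_n$ and the absolute constant $C$ are notation assumed here, and the second bound is written for the regime $n \gtrsim \mathrm{r}(\Sigma)+\log(1/\alpha)$.

```latex
% Assumed notation (not taken from the paper):
%   \mu = E[X],  \kappa_p < \infty a constant,
%   r(\Sigma) = tr(\Sigma) / \|\Sigma\|_{op}  (effective rank),
%   \widehat{\Sigma}_n = sample covariance matrix.

% L^p-L^2 moment assumption on one-dimensional marginals, p >= 4:
\left(\mathbb{E}\,\bigl|\langle X-\mu, v\rangle\bigr|^{p}\right)^{1/p}
  \;\le\; \kappa_p \left(\mathbb{E}\,\langle X-\mu, v\rangle^{2}\right)^{1/2}
  \qquad \text{for all } v\in\mathbb{R}^{d}.

% Gaussian-type benchmark, holding with probability at least 1-\alpha:
\bigl\|\widehat{\Sigma}_n-\Sigma\bigr\|_{\mathrm{op}}
  \;\le\; C\,\|\Sigma\|_{\mathrm{op}}
  \left(\sqrt{\frac{\mathrm{r}(\Sigma)}{n}}+\sqrt{\frac{\log(1/\alpha)}{n}}\right).
```

Read this way, the abstract's claim is that a bound of the second type remains available for a suitable estimator when only the first condition holds, rather than full Gaussianity.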

Paper Structure (27 sections, 19 theorems, 160 equations)


Key Result

Theorem 1.3

There exists a constant $C>0$ such that the following holds. Fix a confidence parameter $1-\alpha\in (0,1)$, a sample size $n\in\mathbb{N}$ and a contamination parameter $\eta\in [0,1/2)$. Then, there is an estimator (i.e. a measurable function) $\widehat{{\sf E}}_{\star}:(\mathbb{R}^{d})^n \to \mat

Theorems & Definitions (48)

  • Remark 1.1: Parallel work by Abdalla and Zhivotovskiy
  • Theorem 1.3: Main result; proof in § \ref{sub:finalfinal}
  • Remark 1.4: Comparison with Mendelson \cite{mendelson2021lp}
  • Remark 1.5: Trimming vs. soft truncation \cite{abdalla2022}
  • Remark 1.6: Further remarks on trimming
  • Remark 1.7: On a previous version of Theorem \ref{thm:main}
  • Remark 2.1
  • Proposition 2.2: Bernstein-type concentration inequality for Gaussian smoothed process
  • proof
  • Proposition 3.1
  • ...and 38 more