Computable Bernstein Certificates for Cross-Fitted Clipped Covariance Estimation

Even He, Zaizai Yan

TL;DR

This work tackles operator-norm covariance estimation for heavy-tailed data with a fraction of outliers. It introduces a cross-fitted clipped covariance estimator that comes with fully computable Bernstein-type deviation certificates, enabling data-driven clipping via MinUpper while accounting for intrinsic complexity through the effective rank $\mathbf{r}(\Sigma)$. By conditioning on training data, the approach derives a nonasymptotic envelope that holds uniformly over a grid of clipping levels and folds, with guarantees that adapt to the tail regime (finite fourth moments sufficing) and to spiked/low-rank covariance structures. Empirical results on contaminated spiked-covariance benchmarks show stable performance and competitive accuracy across regimes, with substantial speed advantages over iterative robust estimators. The framework provides a principled, certificate-based method for tunable robust covariance estimation in high-dimensional settings, while highlighting a fundamental bias-variance trade-off and the role of intrinsic dimension in scaling.

Abstract

We study operator-norm covariance estimation from heavy-tailed samples that may include a small fraction of arbitrary outliers. A simple and widely used safeguard is \emph{Euclidean norm clipping}, but its accuracy depends critically on an unknown clipping level. We propose a cross-fitted clipped covariance estimator equipped with \emph{fully computable} Bernstein-type deviation certificates, enabling principled data-driven tuning via a selector (\emph{MinUpper}) that balances certified stochastic error and a robust hold-out proxy for clipping bias. The resulting procedure adapts to intrinsic complexity measures such as effective rank under mild tail regularity and retains meaningful guarantees under only finite fourth moments. Experiments on contaminated spiked-covariance benchmarks illustrate stable performance and competitive accuracy across regimes.
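To make the ingredients named in the abstract concrete, the following is a minimal sketch of Euclidean norm clipping, a cross-fitted clipped second-moment estimate, and the effective rank, taken here in its standard form $\mathrm{tr}(\Sigma)/\|\Sigma\|_{\mathrm{op}}$. It is not the paper's procedure: the function names, the use of a coordinate-wise median of the training folds as the centering step, and the fixed clipping level `gamma` are all assumptions made for illustration. The Bernstein certificates and the MinUpper selector are not reproduced.

```python
import numpy as np

def clip_rows(X, gamma):
    """Rescale each row of X so its Euclidean norm is at most gamma."""
    norms = np.linalg.norm(X, axis=1)
    # Rows already inside the ball are left untouched (scale = 1).
    scale = np.minimum(1.0, gamma / np.maximum(norms, 1e-12))
    return X * scale[:, None]

def clipped_second_moment(X, gamma):
    """Average outer product of the norm-clipped rows."""
    Xc = clip_rows(X, gamma)
    return Xc.T @ Xc / Xc.shape[0]

def effective_rank(S):
    """trace(S) / operator norm of S, for a symmetric PSD matrix S."""
    eig = np.linalg.eigvalsh(S)
    return float(eig.sum() / eig.max())

def cross_fitted_clipped_cov(X, gamma, n_folds=2, seed=0):
    """Hypothetical cross-fitting scheme (an assumption, not the paper's):
    each fold is centered with the coordinate-wise median of the OTHER
    folds, clipped at level gamma, and the fold estimates are averaged."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), n_folds)
    estimates = []
    for k, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        center = np.median(X[train_idx], axis=0)
        estimates.append(clipped_second_moment(X[test_idx] - center, gamma))
    return np.mean(estimates, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Heavy-tailed sample (Student t, 5 degrees of freedom) with a few outliers.
    X = rng.standard_t(df=5, size=(400, 10))
    X[:8] += 50.0
    S_hat = cross_fitted_clipped_cov(X, gamma=10.0, n_folds=4)
    print("operator norm:", np.linalg.norm(S_hat, 2))
    print("effective rank:", effective_rank(S_hat))
```

Because every clipped row has norm at most `gamma`, the operator norm of each fold estimate is at most `gamma**2`, which is what keeps a few arbitrary outliers from dominating the estimate. The price is a bias that grows as `gamma` shrinks, and this is the trade-off a data-driven selector has to balance.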

Paper Structure

This paper contains 50 sections, 15 theorems, 120 equations, 15 tables, 1 algorithm.

Key Result

Theorem 1

With probability at least $1-\delta_{\mathrm{var}}$ (over all data), simultaneously for all folds $k$ and all $\gamma\in\mathcal{G}$, the conditional empirical covariance satisfies: [display equation missing from source]. Consequently, defining [definition missing from source], we have on the same event: [display equation missing from source].

Theorems & Definitions (39)

  • Remark 1: Even fold sizes for paired variance proxies
  • Remark 2: Tighter time-uniform alternative
  • Theorem 1: Uniform variance control (fully data-dependent)
  • Proof: Proof sketch
  • Remark 3: Cross-fitting aggregation and parallel overhead
  • Proposition 2: Robust Hold-Out Bias Proxy Bound
  • Proposition 3: Oracle inequality for MinUpper selection
  • Remark 4: Trace-scale slack and spiked regimes
  • Definition 1: $L_4$--$L_2$ Norm Equivalence
  • Theorem 4: Intrinsic-Dimension Variance Scaling at Optimal Scale
  • ...and 29 more