Spectral analysis of spatial-sign covariance matrices for heavy-tailed data with dependence
Hantao Chen, Cheng Wang
TL;DR
The paper establishes the limiting spectral distribution (LSD) and a central limit theorem (CLT) for linear spectral statistics of the spatial-sign covariance matrix, a self-normalized version of the sample covariance matrix, for α-regularly varying heavy-tailed data with general covariance Σ. By leveraging concentration properties of self-normalized random variables and precise quadratic-form analyses, it proves that the Marčenko-Pastur equation governs the LSD when α≥2 and that the CLT for linear spectral statistics holds when α>4, with explicit mean and covariance structures that depend on the tail index through τ=E[Z^4]. The results extend classical random-matrix theory to robust, self-normalized constructions in heavy-tailed, dependent settings, and the tail conditions are shown to be nearly the weakest possible. These findings provide principled, robust spectral tools for high-dimensional multivariate analysis when data exhibit heavy tails and dependence, with practical implications for robust PCA and related methods.
Abstract
This paper investigates the spectral properties of spatial-sign covariance matrices, a self-normalized version of sample covariance matrices, for data from $α$-regularly varying populations with general covariance structures. By exploiting the elegant properties of self-normalized random variables, we establish the limiting spectral distribution and a central limit theorem for linear spectral statistics. We demonstrate that the Marčenko-Pastur equation holds under the condition $α\geq 2$, while the central limit theorem for linear spectral statistics is valid for $α>4$, which are shown to be nearly the weakest possible conditions for spatial-sign covariance matrices from heavy-tailed data in the presence of dependence.
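To make the object of study concrete, the following is a minimal illustrative sketch of how a spatial-sign covariance matrix and its eigenvalues might be computed with NumPy. It is not taken from the paper. The scaling $(p/n)\sum_i s_i s_i^\top$ with $s_i = x_i/\|x_i\|$, the assumption that the location is known and equal to zero, the Student-t entries with 3 degrees of freedom (tail index 3), the diagonal choice of Σ, and the helper name `spatial_sign_covariance` are all assumptions made for illustration.

```python
import numpy as np


def spatial_sign_covariance(X):
    """Spatial-sign covariance of the rows of X (hypothetical helper).

    X is an (n, p) array whose rows are observations, assumed already
    centered (location taken to be zero). Each row is projected onto the
    unit sphere, and the result is scaled by p/n (an assumed convention)
    so that the trace equals p.
    """
    n, p = X.shape
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    S = X / norms  # spatial signs: every row has unit Euclidean norm
    return (p / n) * (S.T @ S)


rng = np.random.default_rng(0)
n, p = 400, 100

# Heavy-tailed entries: Student t with 3 degrees of freedom (illustrative).
Z = rng.standard_t(df=3, size=(n, p))

# A simple non-identity covariance: diagonal Sigma with entries in [0.5, 1.5].
sigma_sqrt = np.diag(np.sqrt(np.linspace(0.5, 1.5, p)))
X = Z @ sigma_sqrt

B = spatial_sign_covariance(X)
eigvals = np.linalg.eigvalsh(B)  # real eigenvalues in ascending order
```

Because every spatial sign has unit norm, the trace of `B` equals `p` regardless of how heavy the tails are; this self-normalization is what makes the matrix robust. A histogram of `eigvals` gives the empirical spectral distribution whose limit the abstract describes.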
