Generalized Linear Spectral Statistics of High-dimensional Sample Covariance Matrices and Its Applications

Yanlin Hu, Qing Yang, Xiao Han

TL;DR

This work develops Generalized Linear Spectral Statistics (GLSS) for high-dimensional sample covariance matrices, introducing $\operatorname{tr}f(\bm{S}_n)\bm{B}_n$ to probe spectral properties with a flexible rank-$k_n$ projection. It establishes central limit theorems for GLSS in regimes where $k_n$ is comparable to $n$ or vanishes relative to $n$, including explicit closed-form and general-$\bm{\Sigma}_n$ examples, and provides normalization results to aid practical use. Building on GLSS, the authors propose a functional projection approach for eigenspace testing in population-spiked covariance models, achieving asymptotic normality under mild spike assumptions and demonstrating universality even when spike magnitudes are small. Extensive simulations and numerical studies validate the theory, compare against existing methods, and demonstrate robustness and practical viability for high-dimensional spectral inference.

Abstract

In this paper, we introduce the Generalized Linear Spectral Statistics (GLSS) of a high-dimensional sample covariance matrix $\bm{S}_n$, denoted as $\operatorname{tr}f(\bm{S}_n)\bm{B}_n$, which effectively captures distinct spectral properties of $\bm{S}_n$ by incorporating an ancillary matrix $\bm{B}_n$ and a test function $f$. The joint asymptotic normality of GLSS associated with different test functions is established under mild assumptions on $\bm{B}_n$ and the underlying distribution, when the dimension $n$ and sample size $N$ are comparable. The convergence rate of GLSS is determined by $\sqrt{N/\operatorname{rank}(\bm{B}_n)}$. Subsequently, we propose a novel functional projection approach based on GLSS for hypothesis testing on eigenspaces of "population-spiked" covariance matrices, showcasing a universality phenomenon in the magnitude of the spikes. The theoretical accuracy of our results established for GLSS and the advantages of the newly suggested testing procedure are demonstrated through various numerical studies.
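
To make the statistic concrete, the following is a minimal numerical sketch of $\operatorname{tr}f(\bm{S}_n)\bm{B}_n$, not the authors' implementation. It relies only on the spectral identity $\operatorname{tr}f(\bm{S}_n)\bm{B}_n=\sum_i f(\lambda_i)\,\bm{u}_i^{\top}\bm{B}_n\bm{u}_i$, where $(\lambda_i,\bm{u}_i)$ are the eigenpairs of $\bm{S}_n$. The function name `glss`, the identity population covariance, the uncentered normalization $\bm{S}_n=\bm{X}\bm{X}^{\top}/N$, the coordinate-projection choice of $\bm{B}_n$, and the sizes $n=50$, $N=100$, $k=5$ are all illustrative assumptions.

```python
import numpy as np


def glss(S, B, f):
    """Return tr(f(S) B) for a symmetric S, with f applied to the eigenvalues of S."""
    eigvals, eigvecs = np.linalg.eigh(S)  # S = U diag(eigvals) U^T
    # Quadratic forms u_i^T B u_i, one per eigenvector (columns of eigvecs).
    quad_forms = np.einsum("ji,jk,ki->i", eigvecs, B, eigvecs)
    # tr(f(S) B) = sum_i f(lambda_i) * u_i^T B u_i
    return float(np.sum(f(eigvals) * quad_forms))


rng = np.random.default_rng(0)
n, N, k = 50, 100, 5  # dimension, sample size, rank of B (illustrative values)

# Assumption: identity population covariance and known zero mean.
X = rng.standard_normal((n, N))
S = X @ X.T / N

# Hypothetical ancillary matrix: projection onto the first k coordinates (rank k).
B = np.zeros((n, n))
B[np.arange(k), np.arange(k)] = 1.0

value = glss(S, B, lambda x: x**2)  # GLSS with test function f(z) = z^2
direct = np.trace(S @ S @ B)        # same quantity computed without eigenvectors
print(value, direct)
```

With $\bm{B}_n$ equal to the identity, the quadratic forms are all one and the statistic reduces to the classical linear spectral statistic $\sum_i f(\lambda_i)$. A non-trivial $\bm{B}_n$ weights each eigenvalue by how its eigenvector aligns with $\bm{B}_n$.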

Paper Structure (12 sections, 5 theorems, 59 equations, 11 figures, 2 tables)

Key Result

Theorem 2.2

[$k_n$ is comparable to $n$]. Suppose that Assumptions asa, asb and asc (i) hold. Let $f_1, \ldots, f_r$ be analytic functions on an open interval containing $[d_{-},d^{+}]$, where ... Recall the definition of GLSS in core and define ... where $\Gamma$ is a contour taken in the positive direction enclosing an open interval covering $[d_{-},d^{+}]$. Then we have the following results: (i) the random vect...

Figures (11)

  • Figure 1: Model 1: (a): Histogram of the records $\left(\widetilde{\Theta}_n^1(f),\cdots,\widetilde{\Theta}_n^M(f)\right)$ with $X_{ij}\sim\mathcal{N}(0,1)$ and density curve of $\mathcal{N}(0,1)$ (blue line) (b): QQ-plot of the records.
  • Figure 2: Model 2: (a): Histogram of the records $\left(\widetilde{\Theta}_n^1(f),\cdots,\widetilde{\Theta}_n^M(f)\right)$ with $X_{ij}\sim (\text{Gamma}(2,1)-2)/\sqrt{2}$ and density curve of $\mathcal{N}(0,1)$ (blue line) (b): QQ-plot of the records.
  • Figure 3: Model 7: (a): Histogram of the records $\left(\widetilde{\Theta}_n^1(f),\cdots,\widetilde{\Theta}_n^M(f)\right)^{\top}$ with $X_{ij}\sim\mathcal{N}(0,1)$ and density curve of $\mathcal{N}(0,1)$ (blue line) (b): QQ-plot of the records.
  • Figure 4: Model 8: (a): Histogram of the records $\left(\widetilde{\Theta}_n^1(f),\cdots,\widetilde{\Theta}_n^M(f)\right)^{\top}$ with $X_{ij}\sim (\text{Gamma}(2,1)-2)/\sqrt{2}$ and density curve of $\mathcal{N}(0,1)$ (blue line) (b): QQ-plot of the records.
  • Figure 5: Power comparison for Scenario I when $X_{ij}\sim \mathcal{N}(0,1)$. The angle $\varphi$ varies within $\{1\%, 2\%, \cdots, 80\%\}\times \pi/2$. The data dimension is $n=500$. The sample size in the left plot (a) is $N=500$, while in the right plot (b) it is $N=1000$. FP_z2 and FP_z3 represent our approach FP with $f(z)=z^2$ and $f(z)=z^3$, respectively.
  • ...and 6 more figures

Theorems & Definitions (14)

  • Definition 2.1
  • Theorem 2.2
  • Proposition 1
  • Remark 2.1
  • Theorem 2.3
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Theorem 2.4
  • Example 1
  • ...and 4 more