Tracy-Widom, Gaussian, and Bootstrap: Approximations for Leading Eigenvalues in High-Dimensional PCA
Nina Dörnemann, Miles E. Lopes
TL;DR
The paper addresses the problem of deciding whether the leading sample eigenvalue in high-dimensional PCA exhibits Tracy-Widom fluctuations (subcritical regime) or Gaussian fluctuations (supercritical regime). It introduces a hypothesis test based on $T_n = \frac{n^{2/3}}{\widehat{\sigma}_n}(\lambda_1(\widehat{\Sigma})-\lambda_2(\widehat{\Sigma}))$, where $\widehat{\sigma}_n$ is a consistently estimated scale, together with a bootstrap for functionals of the leading eigenvalues. The authors prove asymptotic level control under $\mathsf{H}_{0,n}$, power consistency under alternatives with $K$ supercritical spikes, and consistency of the bootstrap in the subcritical regime. Numerical experiments and stock-market data illustrate the test's higher power relative to gap-ratio methods and its practical relevance for high-dimensional inference in PCA.
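As a rough illustration of the statistic $T_n$, the following Python sketch computes the rescaled gap between the two largest sample eigenvalues. It is not the authors' implementation: the function name `gap_statistic` is hypothetical, the sample covariance is assumed to be column-centered and normalized by $n$, and the scale estimate $\widehat{\sigma}_n$ is not constructed here but passed in as the argument `sigma_hat`, since its definition is not given in this summary.

```python
import numpy as np

def gap_statistic(X, sigma_hat):
    """Sketch of T_n = n^(2/3) / sigma_hat * (lambda_1 - lambda_2).

    X         : (n, p) data matrix, one observation per row.
    sigma_hat : positive scale estimate supplied by the caller
                (assumption: its construction is not shown here).
    """
    n, _ = X.shape
    # Assumption: center the columns and normalize by n.
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(S)
    lam1, lam2 = eigvals[-1], eigvals[-2]
    return n ** (2.0 / 3.0) / sigma_hat * (lam1 - lam2)

# Toy usage on pure-noise data with a placeholder scale of 1.0.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 100))
T_demo = gap_statistic(X_demo, sigma_hat=1.0)
```

A large value of the statistic indicates a wide gap between the top two sample eigenvalues; the rejection threshold depends on the paper's theory and is not reproduced in this sketch.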
Abstract
Under certain conditions, the largest eigenvalue of a sample covariance matrix undergoes a well-known phase transition when the sample size $n$ and data dimension $p$ diverge proportionally. In the subcritical regime, this eigenvalue has fluctuations of order $n^{-2/3}$ that can be approximated by a Tracy-Widom distribution, while in the supercritical regime, it has fluctuations of order $n^{-1/2}$ that can be approximated by a Gaussian distribution. However, the statistical problem of determining which regime underlies a given dataset is far from resolved. We develop a new testing framework and procedure to address this problem. In particular, we demonstrate that the procedure has an asymptotically controlled level, and that it is power consistent for certain alternatives. Also, this testing procedure enables the design of a new bootstrap method for approximating the distributions of functionals of the leading sample eigenvalues within the subcritical regime, which is the first such method that is supported by theoretical guarantees.
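The phase transition described above can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: it uses Gaussian data with population covariance $\mathrm{diag}(\ell, 1, \dots, 1)$ (a single-spike model), for which the classical critical value of the spike is $\ell = 1 + \sqrt{p/n}$. The function name `leading_eigenvalue` and the choices of $n$, $p$, $\ell$, and the number of replications are all hypothetical.

```python
import numpy as np

def leading_eigenvalue(n, p, spike, rng):
    """Largest eigenvalue of the sample covariance of n Gaussian
    observations with population covariance diag(spike, 1, ..., 1)."""
    Z = rng.standard_normal((n, p))
    Z[:, 0] *= np.sqrt(spike)   # inflate the variance of one coordinate
    S = Z.T @ Z / n             # mean-zero data, so no centering here
    return np.linalg.eigvalsh(S)[-1]

rng = np.random.default_rng(0)
n, p, reps = 400, 200, 50
gamma = p / n
threshold = 1.0 + np.sqrt(gamma)          # critical spike size
bulk_edge = (1.0 + np.sqrt(gamma)) ** 2   # limit of lambda_1 when subcritical

results = {}
for spike in (1.0, 4.0):                  # below and above the threshold
    draws = np.array([leading_eigenvalue(n, p, spike, rng) for _ in range(reps)])
    results[spike] = (draws.mean(), draws.std())
    regime = "supercritical" if spike > threshold else "subcritical"
    print(f"spike={spike}: {regime}, mean={draws.mean():.3f}, sd={draws.std():.3f}")
```

With these settings, the subcritical draws should concentrate tightly near `bulk_edge`, while the supercritical draws should sit near $\ell\,(1 + \gamma/(\ell - 1))$ with visibly larger spread, consistent with the $n^{-2/3}$ versus $n^{-1/2}$ fluctuation scales.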
