Universal Bootstrap for Spectral Statistics: Beyond Gaussian Approximation

Guoyu Zhang; Dandan Jiang; Fang Yao

TL;DR

This work addresses the challenge of deriving distributions for spectral statistics in high- and ultra-high-dimensional covariance matrices where standard Gaussian bootstrap methods fail. It introduces a universal bootstrap grounded in random matrix universality, replacing the data with Gaussian samples of the same covariance to approximate distributions of operator-norm and generalized spectral statistics, without requiring eigenvalue decay or low-rank structure. The authors establish universality properties and consistency results, including a Tracy–Widom-type limit for the largest eigenvalue in ultra-high dimensions, and extend the framework to generalized statistics via T^{ExS} with rigorous error bounds. Through simulations and real-data analysis, the approach demonstrates robust size control and improved power across regimes, providing practical tools for high-dimensional covariance inference and sharp simultaneous confidence intervals.

Abstract

Spectral analysis plays a crucial role in high-dimensional statistics, where determining the asymptotic distribution of various spectral statistics remains a challenging task. Due to the difficulties of deriving the analytic form, recent advances have explored data-driven bootstrap methods for this purpose. However, widely used Gaussian approximation-based bootstrap methods, such as the empirical bootstrap and multiplier bootstrap, have been shown to be inconsistent in approximating the distributions of spectral statistics in high-dimensional settings. To address this issue, we propose a universal bootstrap procedure based on the concept of universality from random matrix theory. Our method consistently approximates a broad class of spectral statistics across both high- and ultra-high-dimensional regimes, accommodating scenarios where the dimension-to-sample-size ratio $p/n$ converges to a nonzero constant or diverges to infinity without requiring structural assumptions on the population covariance matrix, such as eigenvalue decay or low effective rank. We showcase this universal bootstrap method for high-dimensional covariance inference. Extensive simulations and a real-world data study support our findings, highlighting the favorable finite sample performance of the proposed universal bootstrap procedure.
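
The procedure summarized above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes mean-zero data, a known positive-definite null covariance `sigma0`, and the operator-norm statistic $\|\hat{\Sigma}-\Sigma_0\|_{\mathrm{op}}$ as the target. The function names `operator_norm_stat` and `universal_bootstrap_quantile` are hypothetical and do not come from the paper.

```python
import numpy as np


def operator_norm_stat(X, sigma0):
    """Operator norm of (sample covariance - sigma0); rows of X are assumed mean-zero."""
    n = X.shape[0]
    sample_cov = X.T @ X / n
    return np.linalg.norm(sample_cov - sigma0, ord=2)  # largest singular value


def universal_bootstrap_quantile(sigma0, n, level=0.95, n_boot=200, seed=0):
    """Approximate a quantile of the statistic by redrawing Gaussian data.

    Assumption for this sketch: sigma0 is positive definite, so that a
    Cholesky factor exists. Each replicate draws n Gaussian rows with
    covariance sigma0 and recomputes the same statistic.
    """
    rng = np.random.default_rng(seed)
    p = sigma0.shape[0]
    chol = np.linalg.cholesky(sigma0)  # sigma0 = chol @ chol.T
    stats = np.empty(n_boot)
    for b in range(n_boot):
        Z = rng.standard_normal((n, p)) @ chol.T  # rows ~ N(0, sigma0)
        stats[b] = operator_norm_stat(Z, sigma0)
    return np.quantile(stats, level)


if __name__ == "__main__":
    # Toy check with non-Gaussian (Rademacher) data and p > n.
    n, p = 40, 60
    sigma0 = np.eye(p)
    rng = np.random.default_rng(1)
    X = rng.choice([-1.0, 1.0], size=(n, p))  # mean zero, identity covariance
    stat = operator_norm_stat(X, sigma0)
    crit = universal_bootstrap_quantile(sigma0, n, level=0.95, n_boot=100)
    print("statistic:", stat, "critical value:", crit, "reject:", stat > crit)
```

In this sketch the observed data enter only through the statistic; the reference distribution comes entirely from Gaussian draws with covariance `sigma0`, which is the substitution that the universality argument is meant to justify.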

Paper Structure (14 sections, 7 theorems, 28 equations, 2 figures, 2 tables, 2 algorithms)

Introduction
A universality approach: overcoming Gaussianity
A motivating example: covariance inference
Our contributions
Notations and paper organization
Universal bootstrap
Theoretical guarantees
Universality properties
Consistency of universal bootstrap
Consistency of generalized universal bootstrap
Application to covariance inference
Numerical results
Simulation
Data application

Key Result

Theorem 3.1

Under Assumptions ass1, ass2, ass3, ass4, there exists a constant $C$ for large enough $n$ and $t\in\mathbb{R}$ such that for any small $\epsilon>0$, [...]. When $\gamma=1$, similar results also hold for $\lambda_p(\phi^{-1/2}\bm{M}_n)$.

Figures (2)

Figure 1: Empirical power of the supremum, Frobenius, and proposed operator norm tests as a function of the signal level under spike-setting alternatives, for three covariance structures, sample sizes $n=100$ and $300$, dimension $p=1000$, and Gaussian data, based on $2000$ replications.
Figure 2: Empirical power of the supremum, Frobenius, and proposed operator norm tests as a function of the signal level under white-noise-setting alternatives, for three covariance structures, sample sizes $n=100$ and $300$, dimension $p=1000$, and Gaussian data, based on $2000$ replications.

Theorems & Definitions (9)

Theorem 3.1: Universality of the largest eigenvalue
Corollary 1
Definition 1: Admissible pair
Theorem 3.2: Universal bootstrap consistency
Remark
Theorem 3.3: Generalized universal bootstrap consistency
Theorem 4.1
Corollary 2
Theorem 4.2
