Inference on testing the number of spikes in a high-dimensional generalized spiked Fisher matrix

Rui Wang; Dandan Jiang

TL;DR

The paper tackles testing the number of spikes in a high-dimensional generalized spiked Fisher matrix under a two-sample framework without assuming Gaussianity or diagonal covariance. It introduces a universal test statistic based on partial linear spectral statistics and proves a central limit theorem under the null, enabling spike-count testing. The method is then applied to two practical problems: identifying the number of significant variables in large-dimensional linear regression and detecting change points in sequence data, with explicit CLTs and practical algorithms for each case. Extensive simulations across diverse settings and an empirical study on macroeconomic data demonstrate robust size, power, and real-world effectiveness. This work broadens spike-testing tools beyond classical diagonal and Gaussian assumptions, offering a flexible, theory-grounded approach for modern high-dimensional inference.

Abstract

The spiked Fisher matrix is a significant topic for two-sample problems in multivariate statistical inference. This paper is dedicated to testing the number of spikes in a high-dimensional generalized spiked Fisher matrix that relaxes the Gaussian population assumption and the diagonal constraints on the population covariance matrices. First, we propose a general test statistic predicated on partial linear spectral statistics to test the number of spikes, then establish the central limit theorem (CLT) for this statistic under the null hypothesis. Second, we apply the CLT to address two statistical problems: variable selection in high-dimensional linear regression and change point detection. For each test problem, we construct new statistics and derive their asymptotic distributions under the null hypothesis. Finally, simulations and empirical analysis are conducted to demonstrate the remarkable effectiveness and generality of our proposed methods across various scenarios.

Paper Structure (14 sections, 4 theorems, 51 equations, 6 figures, 4 tables, 1 algorithm)

This paper contains 14 sections, 4 theorems, 51 equations, 6 figures, 4 tables, 1 algorithm.

Introduction
Testing the number of spikes for a generalized spiked Fisher matrix
Model formulation
Test on the number of spikes: the general case
The low-rank case of $\mathbf \Sigma_1-\mathbf \Sigma_2$
Application to large-dimensional linear regression models
Application to change point detection
Numerical and empirical studies
Evaluation for the CLT proposed in Theorem 2.1
Numerical studies for large-dimensional linear regression
Numerical studies for change point detection
Empirical study
Conclusion
Proof of Theorem 2.1

Key Result

Theorem 2.1

For the hypothesis testing problem H1, suppose that Assumptions 1 to 3 hold. Then, under the null hypothesis, the test statistic satisfies a central limit theorem whose mean and variance terms, $\mu_{f,H}$ and $\nu_{f,H}$, are functions of $m_0$; their explicit expressions are given in the paper's mean and variance equations.
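To make the objects in the theorem concrete, the following is a minimal sketch of how the eigenvalues of a sample Fisher matrix and a partial linear spectral statistic could be computed. It rests on several assumptions that are not taken from the paper: the function names `fisher_eigenvalues` and `partial_lss` are hypothetical, the data are treated as mean-zero, and the partial statistic is assumed to be the sum of $f$ over the eigenvalues that remain after dropping the $m_0$ largest. The centering and scaling by $\mu_{f,H}$ and $\nu_{f,H}$ are not reproduced, so this is the raw statistic only, not the normalized test.

```python
import numpy as np

def fisher_eigenvalues(X, Y):
    """Return the eigenvalues of F = S1 S2^{-1}, sorted in descending order.

    X is (n1, p) and Y is (n2, p); rows are assumed to be mean-zero
    observations, and n2 > p is needed so that S2 is invertible.
    """
    S1 = X.T @ X / X.shape[0]
    S2 = Y.T @ Y / Y.shape[0]
    # S2^{-1} S1 has the same eigenvalues as S1 S2^{-1}; they are real and
    # non-negative because the matrix is similar to a symmetric PSD matrix.
    eigs = np.linalg.eigvals(np.linalg.solve(S2, S1))
    return np.sort(eigs.real)[::-1]

def partial_lss(eigs, m0, f=np.log):
    """Sum f over the eigenvalues left after dropping the m0 largest.

    This form of the partial statistic is an assumption made for illustration.
    """
    return float(np.sum(f(eigs[m0:])))

# Toy data with one spike: Sigma_2 = I and Sigma_1 = diag(49, 1, ..., 1).
rng = np.random.default_rng(0)
p, n1, n2 = 20, 400, 800
X = rng.standard_normal((n1, p))
X[:, 0] *= 7.0  # variance 49 in the first coordinate
Y = rng.standard_normal((n2, p))

eigs = fisher_eigenvalues(X, Y)
stat = partial_lss(eigs, m0=1)  # raw partial statistic with f(x) = log x
```

In this toy setting the largest eigenvalue separates clearly from the rest, which is the behavior a spike-count test is meant to detect. The choice $f(x)=\log x$ mirrors the function used in Figures 1 and 2.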

Figures (6)

Figure 1: Empirical distribution of $T_{f}$ under the null in Model 1 when $f(x)=\log x$.
Figure 2: Empirical distribution of $T_{f}$ under the null in Model 2 when $f(x)=\log x$.
Figure 3: Empirical distribution of $T_{l}$ under the null in Model 3.
Figure 4: Empirical distribution of $T_{l}$ under the null in Model 4.
Figure 5: Accuracy comparisons among different methods in Models 5 and 6. The mark F indicates that the method fails.
...and 1 more figure

Theorems & Definitions (8)

Theorem 2.1
Corollary 2.1
Example 1
Example 2
Theorem 3.1
Corollary 4.1
Remark 1
Proof
