How weak are weak factors? Uniform inference for signal strength in signal plus noise models

Anna Bykhovskaya, Vadim Gorin, Sasha Sodin

TL;DR

The paper addresses uniform inference on signal strength in high-dimensional signal-plus-noise settings, tackling strong, weak, and critical regimes. It replaces standard Gaussian edge limits with a universal Airy–Green-based framework, introducing the transition function $\mathcal{T}(\Theta)$ to construct confidence intervals that hold across all four canonical models: spiked Wigner, spiked covariance, factor models, and canonical correlation analysis. The key contributions include a rigorous edge-perturbation theory yielding a universal asymptotic expansion, a practical CI construction (with known or unknown noise variance) and a bootstrap variant, and extensive empirical demonstrations in macroeconomics and finance. The findings reveal a remarkable universality: despite model differences, the top eigenvalue fluctuations near the spectral edge follow a common transition behavior, enabling robust inference where Gaussian approximations fail. This framework provides a principled, model-agnostic tool for assessing factor informativeness and signal strength in diverse high-dimensional applications.

Abstract

The paper analyzes four classical signal-plus-noise models: the factor model, spiked sample covariance matrices, the sum of a Wigner matrix and a low-rank perturbation, and canonical correlation analysis with low-rank dependencies. The objective is to construct confidence intervals for the signal strength that are uniformly valid across all regimes - strong, weak, and critical signals. We demonstrate that traditional Gaussian approximations fail in the critical regime. Instead, we introduce a universal transitional distribution that enables valid inference across the entire spectrum of signal strengths. The approach is illustrated through applications in macroeconomics and finance.

Paper Structure

This paper contains 39 sections, 46 theorems, 249 equations, 6 figures, 6 tables.

Key Result

Proposition 1

Suppose that all $\theta_i$ are distinct and ordered $\theta_1>\theta_2>\dots>\theta_r$, and that $\sigma^2=1$. Let $\lambda_1\ge \lambda_2\ge\dots\ge \lambda_N$ denote the eigenvalues of $\mathbf A$ sampled from eq_Spiked_Wigner with $\sigma^2=1$. Denote [...] For each $1\le i \le r$, if $\theta_i>\theta^c$, then as $N\to\infty$, in the sense of convergence in distribution and the Gaussian limits $\mathcal{N}$ [...]
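The supercritical regime described in this proposition can be explored numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a rank-one spike added to a GOE-type matrix scaled so that the noise spectrum fills $[-2,2]$, under which the classical critical value is $\theta^c=1$ and a supercritical top eigenvalue concentrates near $\theta+1/\theta$. The paper's own normalization in eq_Spiked_Wigner is not shown in this excerpt and may differ; the function names `spiked_wigner_top_eigenvalue` and `predicted_limit` are hypothetical.

```python
import numpy as np


def spiked_wigner_top_eigenvalue(n, theta, rng):
    """Largest eigenvalue of W + theta * e1 e1^T.

    Assumption (not from the paper): W is a GOE-type matrix with
    off-diagonal variance 1/n, so its spectrum fills [-2, 2] as n grows.
    """
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2.0 * n)  # symmetric noise, off-diagonal variance 1/n
    w[0, 0] += theta                  # rank-one spike along the first basis vector
    return np.linalg.eigvalsh(w)[-1]  # eigvalsh returns eigenvalues in ascending order


def predicted_limit(theta):
    """Classical first-order limit under the assumed normalization."""
    return theta + 1.0 / theta if theta > 1.0 else 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for theta in (0.5, 1.0, 1.5, 2.0, 3.0):
        lam = spiked_wigner_top_eigenvalue(1000, theta, rng)
        print(f"theta={theta:.1f}  top eigenvalue={lam:.3f}  limit={predicted_limit(theta):.3f}")
```

Below the threshold the top eigenvalue sticks to the bulk edge and carries little information about $\theta$; near $\theta=1$ neither the Gaussian nor the edge description is accurate at finite $N$, which is the regime the transition function $\mathcal{T}(\Theta)$ is meant to cover.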

Figures (6)

  • Figure 1: Confidence intervals for $\theta$ as functions of the observed largest eigenvalue via Gaussian approximations and via our procedure of Section \ref{Section_confidence_intervals}.
  • Figure 2: Quantiles of $\mathcal{T}(\Theta)$ from Corollary \ref{Corollary_uniform_asymptotics} and from the Gaussian approximations based on Proposition \ref{Proposition_Transition_to_Gauss}.
  • Figure 3: IP sample correlation eigenvalues: signals, their $95\%$ confidence intervals, noise, and Marchenko-Pastur distribution with $N=117$, $S=139$.
  • Figure 4: S&P 100 sample covariance: signals, their $95\%$ confidence intervals, noise, and Marchenko-Pastur distribution with $N=92$, $\gamma^2=0.4$, $\sigma=0.02$.
  • Figure 5: Squared sample canonical correlations between cyclical and noncyclical stocks: signals, their $95\%$ confidence intervals, noise, and Wachter distribution with $N=M=80$, $S=520$.
  • ...and 1 more figure

Theorems & Definitions (99)

  • Proposition (jones1978eigenvalue, furedi1981eigenvalues, capitaine2009largest, capitaine2012central)
  • Remark 2.1
  • Proposition (baik2005phase, baik2006eigenvalues, paul2007asymptotics, bai2008central)
  • Proposition (onatski2012asymptotics, benaych2012singular, onatski2018asymptotics)
  • Proposition (bao2019canonical, yang2022limiting, bai2022limiting, hou2023spiked, BG_CCA)
  • Remark 2.2
  • Proposition (forrester1993spectrum, tracy1996orthogonal)
  • Theorem 4.1
  • Definition 4.2
  • Proposition 4.3
  • ...and 89 more