The Spurious Factor Dilemma: Robust Inference in Heavy-Tailed Elliptical Factor Models

Jiang Hu, Jiahui Xie, Yangchun Zhang, Wang Zhou

Abstract

Standard methods for determining the number of factors often overestimate the true number when data exhibit heavy-tailed randomness, misinterpreting noise-induced outliers as genuine factors. This paper addresses this challenge within the framework of Elliptical Factor Models (EFM), which accommodate both heavy tails and potential non-linear dependencies common in real-world data. We demonstrate, both theoretically and empirically, that heavy-tailed noise generates spurious eigenvalues that mimic true factor signals. To distinguish these, we propose a novel methodology based on a fluctuation magnification algorithm. Under mild conditions, we show that, by magnifying perturbations, the eigenvalues associated with real factors exhibit significantly less fluctuation (stabilizing asymptotically) than spurious eigenvalues arising from heavy-tailed effects. We develop a formal testing procedure based on this principle and apply it to the problem of accurately selecting the number of common factors in heavy-tailed EFMs. Simulation studies and real data analysis confirm the effectiveness of our approach, particularly in scenarios with pronounced heavy-tailedness.

Paper Structure

This paper contains 37 sections, 25 theorems, 247 equations, 4 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

Under Assumptions ass_xi and ass_sigma, we have:

Figures (4)

  • Figure 1: Illustration of sample eigenvalues from an EFM with $p=n=1000$. The data follow a multivariate t-distribution with 4 degrees of freedom ($\alpha=2$). The population covariance is $\Sigma = \operatorname{diag}(7, 1, \dots, 1)$ ($m=1$, $\sigma_1=7$). Besides the eigenvalue near 7 (the real signal), a spurious eigenvalue appears around 5.5, well separated from the bulk of the sample eigenvalues.
  • Figure 2: Detection behavior under case (i) with $\mathbf{y}\sim t(4.3)$. Screening and "On" in (a) and (b) indicate that $k = 3$, suggesting the emergence of spurious signals that lead to an overestimation of the true value, $2$. Algorithm \ref{alg_firstround_bootstrap} accurately identifies the initial spurious signal at $3$ by detecting abnormal variance.
  • Figure 3: Detection behavior under case (iii) with $\mathbf{y}\sim t(2.5)$. Screening and "On" in (a) and (b) indicate $\widehat{r} = 3$, suggesting the emergence of spurious signals that lead to an overestimation of the true value, $2$. Algorithm \ref{alg_firstround_bootstrap} accurately identifies the initial spurious signal at $3$ by detecting abnormal variance.
  • Figure 4: (a) Fluctuation in variance over the period "2000-2004". (b) Fluctuation in variance over the period "2004-2008". (c) Change in the number of factors from 1992 to 2024, computed every 48 months.
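
The setting described in the Figure 1 caption can be explored with a short simulation. The sketch below is not the authors' code: it assumes the multivariate t sample is generated as a Gaussian vector with covariance $\Sigma$ multiplied by an independent radial factor, rescaled so that the population covariance equals $\Sigma$ (the paper's own normalization of the EFM may differ). The function name `simulate_t_sample_eigenvalues` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate_t_sample_eigenvalues(p=1000, n=1000, df=4.0, spike=7.0, seed=0):
    """Sorted sample-covariance eigenvalues for multivariate t data.

    Assumption (not taken from the paper): each observation is
    sqrt((df - 2) / chi2_df) * Sigma^{1/2} z with z standard Gaussian,
    so that the population covariance is exactly Sigma.
    """
    rng = np.random.default_rng(seed)

    # Population covariance Sigma = diag(spike, 1, ..., 1): one real factor.
    sigma_diag = np.ones(p)
    sigma_diag[0] = spike

    # Gaussian part, one column per observation.
    z = rng.standard_normal((p, n))

    # Heavy-tailed radial factor shared by all coordinates of an observation.
    # E[(df - 2) / chi2_df] = 1 for df > 2, so the covariance stays Sigma.
    chi2 = rng.chisquare(df, size=n)
    radial = np.sqrt((df - 2.0) / chi2)

    y = np.sqrt(sigma_diag)[:, None] * z * radial[None, :]

    # Sample covariance (mean is known to be zero) and its spectrum,
    # returned in descending order.
    sample_cov = (y @ y.T) / n
    return np.linalg.eigvalsh(sample_cov)[::-1]


if __name__ == "__main__":
    eigenvalues = simulate_t_sample_eigenvalues()
    print("five largest sample eigenvalues:", np.round(eigenvalues[:5], 3))
```

Running this with several seeds and inspecting the leading eigenvalues gives a rough sense of how often an eigenvalue detaches from the bulk even though only one real factor is present. The exact values depend on the seed and on the normalization assumed here, so they need not match the figure.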

Theorems & Definitions (52)

  • Remark 1: Heavy Tails
  • Remark 2
  • Remark 3
  • Remark 4
  • Theorem 1: First Order Approximation
  • Remark 5
  • Lemma 1: Properties of $\theta_i, \zeta_i$
  • Theorem 2: Second Order Approximation
  • Theorem 3
  • Remark 6
  • ...and 42 more