A Note on the Likelihood Ratio Test in High-Dimensional Exploratory Factor Analysis

Yinqiu He, Zi Wang, Gongjun Xu

TL;DR

This work addresses the validity of the likelihood ratio test's chi-square approximation in high-dimensional exploratory factor analysis, where the response dimension $p$ grows with the sample size $N$. By analyzing the moment generating functions of the LR statistics under the null, the authors establish necessary and sufficient conditions for the chi-square limits to hold: for the uncorrected statistic $T_0$, the approximation is valid when $p/N^{1/2}\to 0$, while for the Bartlett-corrected statistic $\rho_0 T_0$ it is valid when $p/N^{2/3}\to 0$; these results extend to generalized tests $T'$ and $T_k$ with corresponding $f'$ and similar phase-transition boundaries. The findings yield practical guidelines for practitioners, showing when chi-square approximations (and Bartlett-corrected versions) are reliable in high-dimensional EFA, and highlighting that Bartlett correction can extend validity to larger $p$. The paper also demonstrates robustness under non-normal data in supplementary simulations, informing sample-size planning and suggesting directions for extending high-dimensional LR theory to other latent-factor settings and fit indices.
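To make the two growth conditions concrete, the following is a minimal Python sketch, not the authors' code. It assumes the classical form of the zero-factor test (a test of complete independence), namely $T_0 = -(N-1)\log|R|$ with $R$ the sample correlation matrix, Bartlett factor $\rho_0 = 1 - (2p+5)/\{6(N-1)\}$, and $f_0 = p(p-1)/2$ degrees of freedom. These formulas are not stated in this summary and are taken from the classical literature as an assumption. The function names (`lrt_zero_factor`, `growth_ratios`) are hypothetical. The ratios $p/N^{1/2}$ and $p/N^{2/3}$ are only reported, since the conditions are asymptotic and no numerical cutoff is given here.

```python
import math
import random


def correlation_matrix(data):
    """Sample correlation matrix of an N x p data matrix (list of rows)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Cross-product matrix; the 1/(N-1) factor cancels in the correlation.
    s = [[sum(r[i] * r[j] for r in centered) for j in range(p)]
         for i in range(p)]
    return [[s[i][j] / math.sqrt(s[i][i] * s[j][j]) for j in range(p)]
            for i in range(p)]


def log_det_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(a)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(acc) if i == j else acc / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def lrt_zero_factor(data):
    """Return (T_0, rho_0 * T_0, f_0) under the ASSUMED classical formulas."""
    n, p = len(data), len(data[0])
    t0 = -(n - 1) * log_det_spd(correlation_matrix(data))  # assumed T_0
    rho0 = 1.0 - (2 * p + 5) / (6.0 * (n - 1))  # assumed Bartlett factor
    f0 = p * (p - 1) // 2  # assumed degrees of freedom
    return t0, rho0 * t0, f0


def growth_ratios(n, p):
    """The two ratios from the TL;DR: p / N^(1/2) and p / N^(2/3)."""
    return p / n ** 0.5, p / n ** (2.0 / 3.0)


if __name__ == "__main__":
    random.seed(0)
    n, p = 200, 8
    # Independent standard normal columns, i.e. data generated under k_0 = 0.
    data = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    t0, t0_bartlett, f0 = lrt_zero_factor(data)
    print("T_0 =", t0, "rho_0*T_0 =", t0_bartlett, "f_0 =", f0)
    print("p/N^(1/2), p/N^(2/3) =", growth_ratios(n, p))
```

Both statistics would be compared with the upper quantile of $\chi^2_{f_0}$. Because $N^{2/3}$ grows faster than $N^{1/2}$, the second ratio is the smaller of the two for any $N > 1$, which is the sense in which the Bartlett correction tolerates a larger $p$.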

Abstract

The likelihood ratio test is widely used in exploratory factor analysis to assess the model fit and determine the number of latent factors. Despite its popularity and clear statistical rationale, researchers have found that when the dimension of the response data is large compared to the sample size, the classical chi-square approximation of the likelihood ratio test statistic often fails. Theoretically, it has been an open problem when such a phenomenon happens as the dimension of data increases; practically, the effect of high dimensionality is less examined in exploratory factor analysis, and there is no clear statistical guideline on the validity of the conventional chi-square approximation. To address this problem, we investigate the failure of the chi-square approximation of the likelihood ratio test in high-dimensional exploratory factor analysis, and derive the necessary and sufficient condition to ensure the validity of the chi-square approximation. The results yield simple quantitative guidelines to check in practice and also provide useful statistical insights into the practice of exploratory factor analysis.

Paper Structure

This paper contains 18 sections, 3 theorems, 39 equations, 11 figures, 1 table.

Key Result

Theorem 1

Suppose $N \geq p+5$. Let $\chi_{f_0}^{2}(\alpha)$ denote the upper-level $\alpha$-quantile of the $\chi^2_{f_0}$ distribution. Under $H_{0,0}: k_0=0$, as $N\to \infty$,

Figures (11)

  • Figure 1: Histograms of $T_0$ and $\rho_0T_0$ with the density curves of $\chi^2_{f_0}$
  • Figure 2: Estimated type I error versus $\varepsilon$ when $k_0=0$
  • Figure 3: Estimated type I error versus $\varepsilon$ when $k_0=1$
  • Figure 4: Estimated type I error versus $\varepsilon$ when $k_0=3$
  • Figure 5: Estimated type I error versus $\varepsilon$ of $t_5$-distributed data
  • ...and 6 more figures

Theorems & Definitions (6)

  • Theorem 1
  • Theorem 2
  • Remark 1
  • Remark 2
  • Lemma 3
  • proof