Matching Criterion for Identifiability in Sparse Factor Analysis
Nils Sturma, Miriam Kranzlmueller, Irem Portakal, Mathias Drton
TL;DR
This work tackles the identifiability problem in sparse confirmatory factor analysis by introducing a graphical, local matching criterion that certifies generic sign-identifiability, that is, generic identifiability of the factor loading matrix up to column signs. The main contributions are the Matching Criterion and its extension, which operate on bipartite factor-analysis graphs and are decidable in polynomial time when the search is restricted to sets of bounded size, and the demonstration that these criteria subsume AR- and BB-identifiability in ZUTA graphs while also covering new sparse scenarios. Through simulations and a real-data POPPA case study, the authors show that extended M-identifiability can certify identifiability in cases where traditional criteria fail, and they provide practical guidance for thresholding in exploratory factor analysis. The framework supports reliable interpretation of sparse latent structures and has implications for model selection, goodness-of-fit, and Bayesian sparse factor analysis.
Abstract
Factor analysis models explain dependence among observed variables by a smaller number of unobserved factors. A main challenge in confirmatory factor analysis is determining whether the factor loading matrix is identifiable from the observed covariance matrix. The factor loading matrix captures the linear effects of the factors and, if unrestricted, can only be identified up to an orthogonal transformation of the factors. However, in many applications the factor loadings exhibit a sparsity pattern that may lead to identifiability up to column signs. We study this phenomenon by connecting sparse confirmatory factor analysis models to bipartite graphs and providing sufficient graphical conditions for identifiability of the factor loading matrix up to column signs. In contrast to previous work, our main contribution, the matching criterion, exploits sparsity by operating locally on the graph structure, thereby improving on existing conditions. Our criterion is decidable in time polynomial in the size of the graph when the search steps are restricted to sets of bounded size.
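The indeterminacy described in the abstract can be illustrated numerically. The sketch below is not taken from the paper: it assumes the standard parametrization in which the covariance is Lambda Lambda^T + Omega with diagonal Omega, and the matrices `Lambda`, `Omega`, `sparse`, the rotation angle, and all helper names are hypothetical choices for illustration. It checks that an orthogonal transformation of the factors, including a column sign flip, leaves the covariance unchanged, and that a rotation typically destroys a prescribed zero pattern while a sign flip preserves it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense model: 5 observed variables, 2 latent factors.
Lambda = rng.normal(size=(5, 2))
Omega = np.diag(rng.uniform(0.5, 1.5, size=5))  # diagonal noise covariance


def covariance(loadings, noise):
    """Covariance implied by the assumed model: Lambda Lambda^T + Omega."""
    return loadings @ loadings.T + noise


# An orthogonal transformation of the factors (here a planar rotation).
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A column sign flip is a special orthogonal transformation.
S = np.diag([1.0, -1.0])

same_cov_after_rotation = np.allclose(
    covariance(Lambda, Omega), covariance(Lambda @ Q, Omega))
same_cov_after_flip = np.allclose(
    covariance(Lambda, Omega), covariance(Lambda @ S, Omega))

# Hypothetical sparse loading matrix; zeros encode missing edges
# between factors (columns) and observed variables (rows).
sparse = np.array([[1.0, 0.0],
                   [0.8, 0.0],
                   [0.5, 0.6],
                   [0.0, 0.9],
                   [0.0, 0.7]])
support = sparse != 0


def has_same_support(matrix, pattern, tol=1e-12):
    """True if the nonzero pattern of `matrix` equals `pattern`."""
    return np.array_equal(np.abs(matrix) > tol, pattern)


rotation_keeps_support = has_same_support(sparse @ Q, support)
flip_keeps_support = has_same_support(sparse @ S, support)

print(same_cov_after_rotation, same_cov_after_flip)
print(rotation_keeps_support, flip_keeps_support)
```

This only shows that one particular rotation violates one particular sparsity pattern. It does not certify identifiability up to column signs, which is what the graphical conditions in the paper address.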
