Sparse factor models of high dimension
Benjamin Poignard, Yoshikazu Terada
TL;DR
This work tackles high-dimensional factor models with sparsity in the loading matrix by casting estimation as a penalized M-estimation problem. With an orthogonal-factor setup and an Anderson-type identification condition, the authors prove sparsistency and estimator consistency as the cross-sectional dimension $p_n$ grows, and provide a GLS-based recovery of latent factors. They develop practical algorithms using SCAD and MCP penalties under Gaussian or least-squares losses, and demonstrate superior sparsity recovery and competitive predictive performance in both simulations and real-data applications, including global minimum variance portfolio (GMVP) allocation and diffusion-index forecasting. The results advance variance-covariance estimation in high dimensions by enabling flexible, interpretable sparse loadings, with direct implications for risk management and macroeconomic prediction.
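To make the ingredients named above concrete, the following is a minimal Python sketch of the SCAD and MCP penalty functions in their standard textbook forms, together with a generic GLS (Bartlett-type) factor-score formula. The function names, the default shape parameters `a=3.7` and `b=3.5`, and the assumption of a diagonal idiosyncratic covariance are illustrative choices, not details taken from the paper; the authors' actual algorithms may differ.

```python
import numpy as np


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty, applied elementwise (requires a > 2).

    Linear near zero, quadratic in between, constant beyond a * lam.
    The default a = 3.7 is a common convention, assumed here.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t
    quadratic = (2.0 * a * lam * t - t**2 - lam**2) / (2.0 * (a - 1.0))
    constant = lam**2 * (a + 1.0) / 2.0
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))


def mcp_penalty(theta, lam, b=3.5):
    """Standard MCP penalty, applied elementwise (requires b > 0).

    Concave up to b * lam, constant afterwards.
    The default b = 3.5 is an illustrative assumption.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(t <= b * lam,
                    lam * t - t**2 / (2.0 * b),
                    b * lam**2 / 2.0)


def gls_factor_scores(X, Lambda, psi_diag):
    """Generic GLS factor scores for X_t = Lambda F_t + eps_t.

    X        : (T, p) data matrix, one observation per row
    Lambda   : (p, m) loading matrix (possibly sparse)
    psi_diag : (p,) idiosyncratic variances (diagonal covariance assumed)

    Returns the (T, m) matrix whose rows are
    (Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1} X_t.
    """
    W = Lambda / psi_diag[:, None]   # Psi^{-1} Lambda, shape (p, m)
    A = Lambda.T @ W                 # Lambda' Psi^{-1} Lambda, shape (m, m)
    B = W.T @ X.T                    # Lambda' Psi^{-1} X', shape (m, T)
    return np.linalg.solve(A, B).T   # shape (T, m)
```

Both penalties behave like the lasso penalty near zero but flatten out for large arguments, which is why they shrink small loadings to exactly zero while leaving large loadings nearly unpenalized. In a hypothetical workflow, one would add `scad_penalty(Lambda, lam).sum()` or `mcp_penalty(Lambda, lam).sum()` to a Gaussian or least-squares loss, minimize over the loadings, and then pass the estimated loadings to `gls_factor_scores` to recover the factors.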
Abstract
We consider the estimation of a sparse factor model where the factor loading matrix is assumed sparse. The estimation problem is reformulated as a penalized M-estimation criterion, while the restrictions for identifying the factor loading matrix accommodate a wide range of sparsity patterns. We prove the sparsistency property of the penalized estimator when the number of parameters is diverging, that is, the consistency of the estimator and the recovery of the true zero entries. These theoretical results are illustrated by finite-sample simulation experiments, and the relevance of the proposed method is assessed by applications to portfolio allocation and macroeconomic data prediction.
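As a reading aid, the setting described above can be written schematically as follows. The notation ($X_t$, $\Lambda$, $F_t$, $\Psi$, $L_n$, $\gamma_n$, $\lambda_{0,jk}$) is assumed here for illustration and may not match the paper's; in particular, the exact scaling of the penalty term and the identification restrictions are not reproduced.

```latex
% Factor model with orthogonal factors (assumed notation)
X_t = \Lambda F_t + \varepsilon_t, \qquad
\operatorname{Var}(F_t) = I_m, \qquad
\operatorname{Var}(\varepsilon_t) = \Psi, \qquad
\Sigma = \Lambda \Lambda^\top + \Psi .

% Generic penalized M-estimation criterion over \theta = (\Lambda, \Psi),
% with L_n a Gaussian or least-squares loss and p(\cdot\,; \gamma_n) a
% sparsity-inducing penalty such as SCAD or MCP
\widehat{\theta} \in \arg\min_{\theta}
  \Big\{ L_n(\theta) + \sum_{j,k} p\big(|\lambda_{jk}|; \gamma_n\big) \Big\} .

% Sparsistency: consistency together with recovery of the true zeros
\big\| \widehat{\theta} - \theta_0 \big\| \xrightarrow{\;P\;} 0,
\qquad
\mathbb{P}\big( \widehat{\lambda}_{jk} = 0 \ \text{for all } (j,k)
  \text{ with } \lambda_{0,jk} = 0 \big) \longrightarrow 1 .
```

Here $\theta_0$ denotes the true parameter and $\lambda_{0,jk}$ the true loadings; both limits are taken as the sample size grows while the number of parameters diverges.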
