Sparse factor models of high dimension

Benjamin Poignard, Yoshikazu Terada

TL;DR

This work tackles high-dimensional factor models with a sparse loading matrix by casting estimation as a penalized M-estimation problem. With an orthogonal-factor setup and an Anderson-type identification condition, the authors prove sparsistency and estimator consistency as the cross-sectional dimension $p_n$ grows, and provide a GLS-based recovery of the latent factors. They develop practical algorithms using the SCAD and MCP penalties under Gaussian or least-squares losses, and report superior sparsity recovery and competitive predictive performance in both simulations and real-data applications, including global minimum variance portfolio (GMVP) allocation and diffusion-index forecasting. The results advance variance-covariance estimation in high dimensions by enabling flexible, interpretable sparse loadings, with direct implications for risk management and macroeconomic prediction.
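
To make the two penalties named above concrete, here is a minimal sketch of the SCAD and MCP penalty functions in their standard textbook form. The function names, the default shape parameters (`a=3.7`, `b=3.5`), and the parametrization are assumptions made for illustration; the paper's own implementation and tuning may differ.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty evaluated at a scalar theta.

    lam >= 0 is the regularization level; a > 2 is the shape parameter
    (3.7 is a conventional default, assumed here for illustration).
    """
    if lam < 0 or a <= 2:
        raise ValueError("SCAD requires lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Near zero, SCAD coincides with the lasso penalty lam * |theta|.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that progressively relaxes the shrinkage.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant beyond a * lam: large coefficients receive no extra penalty.
    return lam * lam * (a + 1) / 2


def mcp_penalty(theta, lam, b=3.5):
    """Standard MCP penalty evaluated at a scalar theta.

    lam >= 0 is the regularization level; b > 0 is the shape parameter
    (3.5 is an illustrative default, not a value taken from the paper).
    """
    if lam < 0 or b <= 0:
        raise ValueError("MCP requires lam >= 0 and b > 0")
    t = abs(theta)
    if t <= b * lam:
        return lam * t - t * t / (2 * b)
    # Constant beyond b * lam.
    return b * lam * lam / 2
```

Both penalties behave like an $\ell_1$ penalty near zero, which encourages exact zeros in the loadings, and flatten out for large arguments, which limits the bias on large loadings. In a penalized criterion of the kind described above, such a penalty would be summed over the entries of the loading matrix and added to the Gaussian or least-squares loss.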

Abstract

We consider the estimation of a sparse factor model where the factor loading matrix is assumed sparse. The estimation problem is reformulated as a penalized M-estimation criterion, while the restrictions for identifying the factor loading matrix accommodate a wide range of sparsity patterns. We prove the sparsistency property of the penalized estimator when the number of parameters is diverging, that is, the consistency of the estimator and the recovery of the true zero entries. These theoretical results are illustrated by finite-sample simulation experiments, and the relevance of the proposed method is assessed by applications to portfolio allocation and macroeconomic data prediction.
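
As a rough illustration of how a sparse loading matrix feeds into the portfolio application, the sketch below builds the covariance implied by an orthogonal factor model, assumed here to take the usual form $\Sigma = \Lambda\Lambda^\top + \Psi$ with diagonal $\Psi$, and computes global minimum variance portfolio weights from it. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def factor_covariance(loadings, idio_var):
    """Covariance implied by an orthogonal factor model (assumed form):
    Sigma = Lambda Lambda^T + diag(idio_var)."""
    loadings = np.asarray(loadings, dtype=float)
    idio_var = np.asarray(idio_var, dtype=float)
    return loadings @ loadings.T + np.diag(idio_var)


def gmvp_weights(sigma):
    """Global minimum variance weights: Sigma^{-1} 1 / (1^T Sigma^{-1} 1)."""
    ones = np.ones(sigma.shape[0])
    # Solve Sigma x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(sigma, ones)
    return x / x.sum()


# Toy example: 4 assets, 2 factors, each asset loading on a single factor.
toy_loadings = np.array([
    [0.8, 0.0],
    [0.6, 0.0],
    [0.0, 0.7],
    [0.0, 0.5],
])
toy_idio_var = np.array([0.20, 0.30, 0.25, 0.40])

toy_sigma = factor_covariance(toy_loadings, toy_idio_var)
toy_weights = gmvp_weights(toy_sigma)
```

In this toy setup the zeros in the loading matrix make assets 1-2 uncorrelated with assets 3-4, which is the kind of interpretable structure a sparse loading estimate is meant to expose.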

Paper Structure

This paper contains 24 sections, 3 theorems, 83 equations, 8 figures, and 8 tables.

Key Result

Theorem 3.1

Suppose $3 r^{-1}_1 + 1.5r^{-1}_2+r^{-1}_3>1$. Let $\rho^{-1} = 3 r^{-1}_1 + 1.5r^{-1}_2+r^{-1}_3+1$. Assume $\log(p_n)^{6/\rho}=o(n)$ holds and Assumptions factor_assumption_1-assumption_regularity_penalty_n are satisfied. Then there exists a sequence of estimators $\widehat{\Sigma}_n = \ldots$
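
To get a feel for the growth condition, consider a purely illustrative calculation with the hypothetical values $r_1=r_2=r_3=2$ (these are not taken from the paper), reading $\log(p_n)^{6/\rho}$ as $(\log p_n)^{6/\rho}$:

$$
3 r_1^{-1} + 1.5\, r_2^{-1} + r_3^{-1} = 1.5 + 0.75 + 0.5 = 2.75 > 1,
\qquad
\rho^{-1} = 2.75 + 1 = 3.75,
\qquad
\frac{6}{\rho} = 6 \times 3.75 = 22.5 .
$$

Under these values the condition reads $\log(p_n)^{22.5} = o(n)$, so $\log p_n$ must grow more slowly than $n^{1/22.5}$. This still allows $p_n$ to grow faster than any polynomial in $n$.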

Figures (8)

  • Figure 1: True loading matrix
  • Figure 2: SOFAR estimator
  • Figure 4: True loading matrix
  • Figure 5: SOFAR estimator
  • Figure 6: Misspecified elements
  • ...and 3 more figures

Theorems & Definitions (4)

  • Theorem 3.1
  • Theorem 3.2
  • Lemma A.1
  • proof