Sparse factor models of high dimension

Benjamin Poignard; Yoshikazu Terada

TL;DR

This work tackles high-dimensional factor models with sparsity in the loading matrix by casting estimation as a penalized M-estimation problem. With an orthogonal-factor setup and an Anderson-type identification condition, the authors prove sparsistency and estimator consistency as the cross-sectional dimension $p_n$ grows, and provide a GLS-based recovery of latent factors. They develop practical algorithms using SCAD and MCP penalties under Gaussian or least-squares losses, and demonstrate superior sparsity recovery and competitive predictive performance in both simulations and real-data applications, including portfolio GMVP and diffusion-index forecasting. The results advance variance-covariance estimation in high dimensions by enabling flexible, interpretable sparse loadings, with direct implications for risk management and macroeconomic prediction.
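For readers unfamiliar with the two penalties named above, here is a minimal sketch of their standard textbook forms (SCAD as in Fan and Li, 2001; MCP as in Zhang, 2010), applied elementwise to loading coefficients. The function names and the default shape parameters `a=3.7` and `b=3.5` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty, elementwise; requires a > 2.
    The default a=3.7 is a common convention, assumed here for illustration."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                            # |theta| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                             # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quadratic, constant))

def mcp_penalty(theta, lam, b=3.5):
    """Standard MCP penalty, elementwise; requires b > 0.
    The default b=3.5 is an assumption made for this sketch."""
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(t <= b * lam, lam * t - t**2 / (2 * b), b * lam**2 / 2)
```

Both penalties behave like the lasso near zero and flatten out for large coefficients, which is what limits the shrinkage bias on large loadings while still setting small ones to zero.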

Abstract

We consider the estimation of a sparse factor model where the factor loading matrix is assumed sparse. The estimation problem is reformulated as a penalized M-estimation criterion, while the restrictions for identifying the factor loading matrix accommodate a wide range of sparsity patterns. We prove the sparsistency property of the penalized estimator when the number of parameters is diverging, that is, the consistency of the estimator and the recovery of the true zero entries. These theoretical results are illustrated by finite-sample simulation experiments, and the relevance of the proposed method is assessed by applications to portfolio allocation and macroeconomic data prediction.
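To make the portfolio application concrete, the sketch below builds a covariance matrix from a factor structure and computes global minimum variance portfolio weights with the standard formula $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$. It assumes orthogonal factors with unit variance and a diagonal idiosyncratic covariance, so that $\Sigma = \Lambda\Lambda^\top + \Psi$. The function names and the toy loading matrix are hypothetical and are not the authors' implementation.

```python
import numpy as np

def factor_covariance(loadings, idio_var):
    """Sigma = Lambda Lambda^T + diag(psi).
    Assumes orthogonal unit-variance factors and diagonal idiosyncratic variances."""
    loadings = np.asarray(loadings, dtype=float)
    return loadings @ loadings.T + np.diag(np.asarray(idio_var, dtype=float))

def gmvp_weights(sigma):
    """Global minimum variance portfolio weights: Sigma^{-1} 1 / (1^T Sigma^{-1} 1)."""
    ones = np.ones(sigma.shape[0])
    x = np.linalg.solve(sigma, ones)  # solve a linear system rather than inverting Sigma
    return x / x.sum()

# Toy example: 4 assets, 2 factors, sparse loadings (illustrative values only).
toy_loadings = np.array([[0.8, 0.0],
                         [0.5, 0.0],
                         [0.0, 0.7],
                         [0.0, 0.4]])
toy_sigma = factor_covariance(toy_loadings, idio_var=[0.2, 0.3, 0.25, 0.4])
toy_weights = gmvp_weights(toy_sigma)
```

The weights sum to one by construction. With a sparse loading matrix, zero loadings translate into zero covariances between the corresponding assets under the assumptions stated above.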

Paper Structure (24 sections, 3 theorems, 83 equations, 8 figures, 8 tables)

Introduction
The framework
Asymptotic properties
Simulations
Real data applications
Portfolio allocation
Data
Global Minimum Variance Portfolio (GMVP)
Competing variance-covariance matrix estimators
Diffusion index data
Discussion and conclusion
Preliminary results
Proofs
Proof of Theorem 3.1
Proof of Theorem 3.2
...and 9 more sections

Key Result

Theorem 3.1

Suppose $3 r^{-1}_1 + 1.5r^{-1}_2+r^{-1}_3>1$. Let $\rho^{-1} = 3 r^{-1}_1 + 1.5r^{-1}_2+r^{-1}_3+1$. Assume $\log(p_n)^{6/\rho}=o(n)$ holds and Assumptions factor_assumption_1-assumption_regularity_penalty_n are satisfied. Then there exists a sequence of estimators $\widehat{\Sigma}_n=\widehat{\Lam

Figures (8)

Figure 1: True loading matrix
Figure 2: SOFAR estimator
Figure 4: True loading matrix
Figure 5: SOFAR estimator
Figure 6: Misspecified elements
...and 3 more figures

Theorems & Definitions (4)

Theorem 3.1
Theorem 3.2
Lemma A.1
proof
