Large covariance matrix estimation via penalized log-det heuristics

Enrico Bernardi, Matteo Farnè

TL;DR

The paper addresses the estimation of large covariance matrices in high dimensions by modeling the true covariance as a low-rank plus sparse decomposition and minimizing a log-det loss augmented with nuclear norm and $\ell_1$ penalties. It establishes that the log-det objective is locally convex with a Lipschitz-continuous gradient, which allows a proximal gradient algorithm to recover the low-rank and sparse components in a single step while also identifying the latent rank and the residual sparsity pattern with high probability. The authors prove algebraic and parametric consistency under a generalized approximate factor model, provide identifiability conditions via tangent-space geometry, and show that the log-det approach performs no worse than the corresponding Frobenius-loss estimators in terms of estimation error. These results are supported by simulations and an ECB real-data example. The work combines eigenvalue-aware spectrum control with joint structure recovery, offering practical computation strategies and theoretical guarantees for applications in finance and beyond.

Abstract

This paper provides a comprehensive estimation framework for large covariance matrices via a log-det heuristic augmented by a nuclear norm plus $\ell_{1}$-norm penalty. We develop the model framework, which includes high-dimensional approximate factor models with a sparse residual covariance. We prove that the aforementioned log-det heuristic is locally convex with a Lipschitz-continuous gradient, so that a proximal gradient algorithm may be stated to numerically solve the problem while controlling the threshold parameters. The proposed optimization strategy recovers in a single step the covariance matrix components together with the latent rank and the residual sparsity pattern with high probability, and performs systematically no worse than the corresponding estimators employing Frobenius loss in place of the log-det heuristic. The error bounds for the ensuing low-rank and sparse covariance matrix estimators are established, and the identifiability conditions for the latent geometric manifolds are provided, improving on the existing literature. The validity of the outlined results is highlighted by an exhaustive simulation study and a financial data example involving Euro Area banks.
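
The proximal gradient idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm. It assumes the smooth loss has the form $\frac{1}{2}\ln\det(I_p+\Delta\Delta^{\top})$ with $\Delta = L + S - \Sigma_n$, where $\Sigma_n$ is the sample covariance matrix; this form is an assumption here, since the excerpt does not state the loss explicitly. The function names, the step size, the starting point, and the simultaneous update of $L$ and $S$ are likewise illustrative choices. The nuclear norm penalty is handled by soft-thresholding eigenvalues (restricting $L$ to be positive semidefinite), and the $\ell_1$ penalty by soft-thresholding the off-diagonal entries of $S$.

```python
import numpy as np


def eigen_threshold(M, tau):
    """Soft-threshold the eigenvalues of a symmetric matrix at tau.

    On the positive semidefinite cone this is the proximal operator of
    tau times the nuclear norm; negative eigenvalues are set to zero.
    """
    M = (M + M.T) / 2.0
    w, V = np.linalg.eigh(M)
    w = np.maximum(w - tau, 0.0)
    return (V * w) @ V.T


def soft_threshold_offdiag(M, tau):
    """Entrywise soft-thresholding of off-diagonal entries; diagonal kept."""
    M = (M + M.T) / 2.0
    out = np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)
    np.fill_diagonal(out, np.diag(M))
    return out


def logdet_loss(Delta):
    """Assumed smooth loss: 0.5 * ln det(I + Delta Delta^T)."""
    p = Delta.shape[0]
    _, logabsdet = np.linalg.slogdet(np.eye(p) + Delta @ Delta.T)
    return 0.5 * logabsdet


def logdet_grad(Delta):
    """Gradient of the assumed loss with respect to Delta: (I + Delta Delta^T)^{-1} Delta."""
    p = Delta.shape[0]
    return np.linalg.solve(np.eye(p) + Delta @ Delta.T, Delta)


def prox_grad_sketch(Sigma_n, psi, rho, step=0.5, n_iter=200):
    """Illustrative proximal gradient iterations for a low-rank plus sparse fit.

    psi and rho play the role of the nuclear norm and l1 thresholds.
    The step size and the starting point are assumptions of this sketch.
    """
    p = Sigma_n.shape[0]
    L = np.zeros((p, p))
    S = np.diag(np.diag(Sigma_n))
    for _ in range(n_iter):
        G = logdet_grad(L + S - Sigma_n)
        L = eigen_threshold(L - step * G, step * psi)
        S = soft_threshold_offdiag(S - step * G, step * rho)
    return L, S
```

In this sketch the rank of the low-rank component can be read off as the number of eigenvalues of `L` that survive thresholding, and the sparsity pattern as the nonzero off-diagonal entries of `S`.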

Paper Structure

This paper contains 50 sections, 23 theorems, 187 equations, 1 figure, 16 tables, 2 algorithms.

Key Result

Theorem 2.1

Let us set $\psi_{0}=\sqrt{\ln(p)/n}/(\widetilde{\kappa}_{L} r/p)^{1/3}$, $\gamma \in \left[2{(\widetilde{\kappa}_{L} r/p)^{1/3}},{1}/{4}\right]$, $\rho_{0}=\gamma\psi_{0}$, with $p \geq 512 \widetilde{\kappa}_{L} r$, $\psi=p\psi_{0}$, and $\rho=\rho_0$, where $\psi$ and $\rho$ are the thresholds in
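
The parameter choices stated at the start of Theorem 2.1 can be evaluated numerically. The Python sketch below is only an illustration of that arithmetic: the function name is hypothetical, and $\widetilde{\kappa}_{L}$ is treated as a user-supplied constant because its definition does not appear in this excerpt. Note that the condition $p \geq 512\,\widetilde{\kappa}_{L} r$ gives $(\widetilde{\kappa}_{L} r/p)^{1/3} \leq 1/8$, so the admissible interval for $\gamma$ is non-empty.

```python
import math


def theorem_2_1_thresholds(n, p, r, kappa_L, gamma):
    """Compute psi_0, rho_0, psi and rho from the quantities in Theorem 2.1.

    n: sample size, p: dimension, r: latent rank,
    kappa_L: constant written as kappa-tilde_L in the theorem (user-supplied here),
    gamma: tuning constant in [2 * (kappa_L * r / p)^(1/3), 1/4].
    """
    if p < 512 * kappa_L * r:
        raise ValueError("Theorem 2.1 requires p >= 512 * kappa_L * r")
    ratio = (kappa_L * r / p) ** (1.0 / 3.0)
    if not (2.0 * ratio <= gamma <= 0.25):
        raise ValueError("gamma must lie in [2 * (kappa_L * r / p)^(1/3), 1/4]")
    psi0 = math.sqrt(math.log(p) / n) / ratio
    rho0 = gamma * psi0
    return {"psi0": psi0, "rho0": rho0, "psi": p * psi0, "rho": rho0}
```

For example, `theorem_2_1_thresholds(n=1000, p=2048, r=2, kappa_L=1.0, gamma=0.25)` returns the four values as a dictionary, with `psi` equal to `p` times `psi0` and `rho` equal to `rho0`.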

Figures (1)

  • Figure 1: ECB data: top six sample eigenvalues.

Theorems & Definitions (32)

  • Theorem 2.1
  • Corollary 2.1
  • Definition 1
  • Definition 2
  • Proposition 4.1
  • Proposition 4.2
  • Remark 1
  • Remark 2
  • Theorem 5.1
  • Theorem 5.2
  • ...and 22 more