Inverse Covariance and Partial Correlation Matrix Estimation via Joint Partial Regression
Samuel Erickson, Tobias Rydén
TL;DR
This work tackles the problem of estimating sparse high-dimensional inverse covariance (precision) and partial correlation matrices from data. It introduces a two-stage, convex joint partial regression framework that simultaneously estimates the precision matrix and the partial correlation matrix under a positive semidefinite constraint, using an initial Lasso-based step to obtain residual variances. The authors derive non-asymptotic error rates for sub-Gaussian data: $\|\widehat{Q}-Q^\star\|_F \lesssim \sqrt{\frac{s\log p}{n}}$ and $\|\widehat{\Omega}-\Omega^\star\|_F \lesssim \sqrt{\frac{(s+p)\log p}{n}}$, and present an efficient proximal-splitting algorithm (PD3O) with per-iteration cost $O(p^3)$ suitable for large $p$. Empirical results on synthetic and stock-market data show the method consistently improves estimation accuracy over the graphical lasso and yields interpretable networks reflecting sector structure. The approach offers a scalable, theoretically grounded alternative for high-dimensional graphical-model estimation, with applications in areas such as genomics and finance.
Abstract
We present a method for estimating sparse high-dimensional inverse covariance and partial correlation matrices, which exploits the connection between the inverse covariance matrix and linear regression. The method is a two-stage estimation method wherein each individual feature is regressed on all other features while positive semi-definiteness is enforced simultaneously. We derive non-asymptotic estimation rates for both inverse covariance and partial correlation matrix estimation. An efficient proximal splitting algorithm for numerically computing the estimate is also derived. The effectiveness of the proposed method is demonstrated on both synthetic and real-world data.
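The connection between the inverse covariance matrix and linear regression mentioned above can be checked numerically. The sketch below is not the proposed estimator. It only illustrates the standard population identities that regression-based approaches rely on: regressing feature $j$ on all other features gives coefficients $-\Omega_{jk}/\Omega_{jj}$ and residual variance $1/\Omega_{jj}$, and the partial correlation between features $j$ and $k$ is $-\Omega_{jk}/\sqrt{\Omega_{jj}\Omega_{kk}}$. The function names, the example matrix, and the unit-diagonal convention for the partial correlation matrix are assumptions made for illustration.

```python
import numpy as np


def precision_to_partial_corr(omega):
    """Map a precision matrix to partial correlations.

    Uses rho_jk = -omega_jk / sqrt(omega_jj * omega_kk) off the diagonal.
    The unit diagonal is a convention assumed here for illustration.
    """
    d = np.sqrt(np.diag(omega))
    rho = -omega / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho


def regression_from_covariance(sigma, j):
    """Population regression of feature j on all other features.

    Returns the indices of the other features, the regression
    coefficients, and the residual variance.
    """
    idx = [k for k in range(sigma.shape[0]) if k != j]
    s_oo = sigma[np.ix_(idx, idx)]      # covariance among the other features
    s_oj = sigma[idx, j]                # covariance between others and feature j
    beta = np.linalg.solve(s_oo, s_oj)
    resid_var = sigma[j, j] - s_oj @ beta
    return idx, beta, resid_var


# Hypothetical sparse (tridiagonal) precision matrix and its covariance.
omega = np.array([
    [2.0, -0.5, 0.0],
    [-0.5, 2.0, -0.5],
    [0.0, -0.5, 2.0],
])
sigma = np.linalg.inv(omega)

for j in range(omega.shape[0]):
    idx, beta, resid_var = regression_from_covariance(sigma, j)
    # Coefficients are -omega_jk / omega_jj; residual variance is 1 / omega_jj.
    assert np.allclose(beta, -omega[j, idx] / omega[j, j])
    assert np.isclose(resid_var, 1.0 / omega[j, j])

partial_corr = precision_to_partial_corr(omega)
```

A zero entry in the precision matrix corresponds to a zero regression coefficient and a zero partial correlation, which is why sparsity can be imposed through the per-feature regressions.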
