High-Performance Variance-Covariance Matrix Construction Using an Uncentered Gram Formulation
Felix Reichel
TL;DR
This work presents an uncentered Gram-based formulation for covariance construction, showing that $\widehat{\boldsymbol{\Sigma}}=\frac{1}{n(n-1)}\bigl(n\mathbf{X}^T\mathbf{X}-\mathbf{s}\mathbf{s}^T\bigr)$ with $\mathbf{s}=\mathbf{X}^T\mathbf{1}_n$ is algebraically identical to the standard centered estimator $\frac{1}{n-1}\mathbf{X}^T\mathbf{H}\mathbf{X}$. By avoiding explicit centering, the method reduces memory traffic and concentrates the correction into a single $p\times p$ outer product, while the Gram product itself runs on BLAS-3 kernels; RXTX variants can further speed up Gram-based computations. The paper provides finite-precision validation, benchmarks against numpy.cov, and discusses practical advantages in environments without a tuned BLAS. It also outlines concrete applications, including sandwich covariances, panel/fixed effects, JIT streaming, and resampling with aggregated statistics, highlighting the method's relevance for large-scale or privacy-conscious settings.
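The claimed identity can be checked in a few lines. The sketch below assumes that $\mathbf{H}$ denotes the usual centering matrix $\mathbf{H}=\mathbf{I}_n-\frac{1}{n}\mathbf{1}_n\mathbf{1}_n^T$, which the summary does not define explicitly:

$$
\begin{aligned}
\mathbf{X}^T\mathbf{H}\mathbf{X}
&= \mathbf{X}^T\mathbf{X}-\frac{1}{n}\,\mathbf{X}^T\mathbf{1}_n\mathbf{1}_n^T\mathbf{X}
 = \mathbf{X}^T\mathbf{X}-\frac{1}{n}\,\mathbf{s}\mathbf{s}^T,\\
\frac{1}{n-1}\,\mathbf{X}^T\mathbf{H}\mathbf{X}
&= \frac{1}{n(n-1)}\bigl(n\,\mathbf{X}^T\mathbf{X}-\mathbf{s}\mathbf{s}^T\bigr)
 = \widehat{\boldsymbol{\Sigma}}.
\end{aligned}
$$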
Abstract
Reichel (2025) defined the bariance as a pairwise-difference measure that can be rewritten in linear time using only scalar sums. We extend this idea to the covariance matrix by showing that the standard matrix expression involving the uncentered Gram matrix and a correction term is algebraically identical to the pairwise-difference definition while avoiding explicit centering. Beyond the Gram matrix itself, the computation then reduces to one outer product of dimension $p\times p$ and a single subtraction. Benchmarks in Python show clear runtime gains, especially when BLAS optimizations are absent. Optionally, faster Gram-matrix routines such as RXTX (Rybin et al., 2025) further reduce the overall cost.
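As an illustration of the formulation described in the abstract, the following is a minimal NumPy sketch, not the author's reference implementation. The function name `gram_covariance`, the random test data, and the choice of float64 are assumptions made for this example; it treats rows of `X` as observations and compares the result against `numpy.cov`.

```python
import numpy as np


def gram_covariance(X):
    """Unbiased sample covariance from the uncentered Gram matrix.

    Hypothetical helper for illustration: rows of X are the n
    observations, columns are the p variables.
    """
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]
    s = X.sum(axis=0)        # s = X^T 1_n (column sums), length p
    G = X.T @ X              # uncentered Gram matrix, p x p
    # (n * G - s s^T) / (n (n - 1)); no centered copy of X is formed
    return (n * G - np.outer(s, s)) / (n * (n - 1))


# Small check against the centered estimator used by numpy.cov
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))

sigma_gram = gram_covariance(X)
sigma_ref = np.cov(X, rowvar=False)   # rowvar=False: columns are variables

max_abs_diff = np.max(np.abs(sigma_gram - sigma_ref))
print("max abs difference:", max_abs_diff)
```

On well-scaled data the two results should agree to roughly machine precision. Because the formula subtracts two quantities of similar size, it can lose accuracy through cancellation when column means are large relative to the spread of the data, so a comparison like the one above is worth running on representative inputs.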
