Asymptotics of Linear Regression with Linearly Dependent Data
Behrad Moniri, Hamed Hassani
TL;DR
This work studies high-dimensional ridge regression when the covariates are linearly dependent and have a spatio-temporal covariance structure. It establishes a Gaussian universality principle: the asymptotic estimation error is unchanged when the non-Gaussian covariates are replaced by Gaussian ones with the same mean and covariance. It then derives a precise fixed-point characterization of the error in terms of the spectra of the spatio-temporal covariance matrices. The limiting error is governed by a scalar function $m(\lambda;\gamma)$ and the solution $\kappa$ of a fixed-point equation, which allows efficient computation of the bias, variance, and total error. The optimal ridge parameter is $\lambda_\star=\sigma_\varepsilon^2\gamma/\alpha^2$, which does not depend on the dependence structure. The authors also analyze how dependence affects overparameterization and the double descent phenomenon, and simulations agree with the theoretical predictions. Together, these results give precise error characterizations and guidance on choosing the regularization under structured dependence.
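The claim that $\lambda_\star=\sigma_\varepsilon^2\gamma/\alpha^2$ does not depend on the dependence structure can be checked numerically. The sketch below is illustrative only and rests on assumptions that the summary does not state: $\gamma = d/n$, $\sigma_\varepsilon^2$ is the noise variance, the coefficient vector is isotropic with $\mathbb{E}\|\beta\|^2=\alpha^2$, the ridge objective is $\frac{1}{n}\|y-X\beta\|^2+\lambda\|\beta\|^2$, and the covariates are $X = L_t Z L_s^\top$ with AR(1) temporal and spatial covariances. The helper names `ar1_cov` and `expected_error` are hypothetical. Under these assumptions the expected estimation error for a fixed design has a closed form in the eigenvalues of $X^\top X/n$, and a grid search should land on $\lambda_\star$.

```python
import numpy as np

def ar1_cov(m, rho):
    """AR(1) covariance matrix with entries rho^|i-j| (an assumed dependence model)."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def expected_error(X, lam, alpha2, sigma2):
    """E||beta_hat - beta||^2 for a fixed design X, averaged over an isotropic
    beta with E||beta||^2 = alpha2 and i.i.d. noise with variance sigma2.
    Assumes beta_hat = (X^T X / n + lam I)^{-1} X^T y / n."""
    n, d = X.shape
    s = np.clip(np.linalg.eigvalsh(X.T @ X / n), 0.0, None)
    bias = lam**2 * (alpha2 / d) * np.sum(1.0 / (s + lam) ** 2)
    variance = (sigma2 / n) * np.sum(s / (s + lam) ** 2)
    return bias + variance

rng = np.random.default_rng(0)
n, d = 300, 150
alpha2, sigma2 = 1.0, 0.5
gamma = d / n

# Linearly dependent, non-Gaussian covariates: X = L_t Z L_s^T with Rademacher Z.
L_t = np.linalg.cholesky(ar1_cov(n, 0.6))   # temporal (across samples)
L_s = np.linalg.cholesky(ar1_cov(d, 0.3))   # spatial (across features)
Z = rng.choice([-1.0, 1.0], size=(n, d))
X = L_t @ Z @ L_s.T

lam_star = sigma2 * gamma / alpha2
grid = np.linspace(0.05, 1.0, 96)            # step 0.01
errors = np.array([expected_error(X, lam, alpha2, sigma2) for lam in grid])
lam_best = grid[np.argmin(errors)]

print(f"lambda_star = {lam_star:.3f}, grid minimizer = {lam_best:.3f}")
```

Changing the two AR(1) parameters changes the error curve but, under the stated assumptions, not the location of its minimum.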
Abstract
In this paper we study the asymptotics of linear regression with non-Gaussian covariates that exhibit a linear dependency structure, departing from the standard assumption of independence. We model the covariates using stochastic processes with spatio-temporal covariance and analyze the performance of ridge regression in the high-dimensional proportional regime, where the number of samples and the number of features grow proportionally. We prove a Gaussian universality theorem showing that the asymptotics are invariant under replacing the non-Gaussian covariates with Gaussian vectors that preserve the mean and covariance. For the Gaussian model, tools from random matrix theory yield a precise characterization of the estimation error in terms of a fixed-point equation involving the spectral properties of the spatio-temporal covariance matrices, which enables efficient computation. We then study optimal regularization, overparameterization, and the double descent phenomenon in the context of dependent data. Simulations validate our theoretical predictions and shed light on how dependencies influence the estimation error and the choice of regularization parameter.
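A small simulation can illustrate the universality statement at finite size. The following is a minimal sketch, not the authors' experimental setup: it assumes covariates of the form $X = L_t Z L_s^\top$ with AR(1) temporal and spatial covariances, an isotropic Gaussian coefficient vector, Gaussian noise, and the ridge estimator $(X^\top X/n+\lambda I)^{-1}X^\top y/n$. The helper names `ar1_cov` and `ridge_error` and all parameter values are hypothetical. It compares the average estimation error when $Z$ has Gaussian entries with the error when $Z$ has Rademacher entries, which share the same mean and covariance.

```python
import numpy as np

def ar1_cov(m, rho):
    """AR(1) covariance matrix with entries rho^|i-j| (an assumed dependence model)."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def ridge_error(X, lam, sigma, trials, rng):
    """Average ||beta_hat - beta||^2 over independent draws of beta and noise."""
    n, d = X.shape
    beta = rng.standard_normal((d, trials)) / np.sqrt(d)   # E||beta||^2 = 1
    noise = sigma * rng.standard_normal((n, trials))
    Y = X @ beta + noise
    # Solve the ridge normal equations for all trials at once.
    beta_hat = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ Y / n)
    return float(np.mean(np.sum((beta_hat - beta) ** 2, axis=0)))

rng = np.random.default_rng(1)
n, d, lam, sigma = 400, 200, 0.2, 0.5
L_t = np.linalg.cholesky(ar1_cov(n, 0.5))   # temporal (across samples)
L_s = np.linalg.cholesky(ar1_cov(d, 0.3))   # spatial (across features)

Z_gauss = rng.standard_normal((n, d))
Z_rad = rng.choice([-1.0, 1.0], size=(n, d))  # same mean and variance as Gaussian

err_gauss = ridge_error(L_t @ Z_gauss @ L_s.T, lam, sigma, 50, rng)
err_rad = ridge_error(L_t @ Z_rad @ L_s.T, lam, sigma, 50, rng)
rel_gap = abs(err_gauss - err_rad) / err_gauss

print(f"Gaussian: {err_gauss:.4f}  Rademacher: {err_rad:.4f}  relative gap: {rel_gap:.3f}")
```

The two errors should be close, with the gap shrinking as $n$ and $d$ grow proportionally; the remaining difference at this size reflects finite-sample fluctuations rather than a failure of universality.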
