$\ell_1$-Regularized Generalized Least Squares

Kaveh S. Nobari; Alex Gibberd

TL;DR

This paper develops an $\ell_{1}$-regularized generalized least squares (GLS) framework for high-dimensional regressions with autocorrelated errors, focusing on errors that follow an AR($q$) process. It introduces a three-step procedure: (i) a LASSO fit on the original data, (ii) an AR fit to the residuals, and (iii) a second-stage LASSO on the whitened data. A feasible variant (FGLS-LASSO) estimates the AR structure from the data. The authors prove that, under a sub-Gaussian design and AR errors, the rotated design retains a strong restricted eigenvalue condition, and they obtain oracle-type error bounds for GLS-LASSO; FGLS-LASSO inherits similar guarantees, with an additional error term from estimating the AR parameters. In simulations, GLS-LASSO and FGLS-LASSO perform on par with the standard LASSO when the errors are white noise and outperform it when the errors are highly autocorrelated, with notable finite-sample gains for persistent error processes. Overall, the work provides a practically implementable method for improving sparse regression estimation in the presence of autocorrelated disturbances, with explicit nonasymptotic error bounds and supporting empirical evidence.

Abstract

We study an $\ell_{1}$-regularized generalized least-squares (GLS) estimator for high-dimensional regressions with autocorrelated errors. Specifically, we consider the case where errors are assumed to follow an autoregressive process, alongside a feasible variant of GLS that estimates the structure of this process in a data-driven manner. The estimation procedure consists of three steps: performing a LASSO regression, fitting an autoregressive model to the realized residuals, and then running a second-stage LASSO regression on the rotated (whitened) data. We examine the theoretical performance of the method in a sub-Gaussian random-design setting, in particular assessing the impact of the rotation on the design matrix and how this impacts the estimation error of the procedure. We show that our proposed estimators maintain smaller estimation error than an unadjusted LASSO regression when the errors are driven by an autoregressive process. A simulation study verifies the performance of the proposed method, demonstrating that the penalized (feasible) GLS-LASSO estimator performs on par with the LASSO in the case of white noise errors, whilst outperforming when the errors exhibit significant autocorrelation.
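The three-step procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names (`lasso_ista`, `fit_ar`, `whiten`, `fgls_lasso`) are hypothetical, the LASSO is solved with a plain proximal-gradient (ISTA) loop, the AR($q$) coefficients are fitted by ordinary least squares on lagged residuals, and the whitening step simply drops the first $q$ observations rather than rescaling them. The penalty levels `lam1` and `lam2` are arbitrary and would need tuning in practice.

```python
import numpy as np


def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)||y - X b||_2^2 + lam * ||b||_1 by proximal gradient."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b


def fit_ar(resid, q):
    """Least-squares AR(q) fit: resid[t] ~ sum_j phi[j-1] * resid[t-j]."""
    n = len(resid)
    lags = np.column_stack([resid[q - j: n - j] for j in range(1, q + 1)])
    phi, *_ = np.linalg.lstsq(lags, resid[q:], rcond=None)
    return phi


def whiten(A, phi):
    """Apply the AR filter A[t] - sum_j phi[j-1] * A[t-j] along axis 0.

    Assumption: the first q rows are dropped instead of being rescaled.
    """
    q, n = len(phi), A.shape[0]
    out = A[q:].copy()
    for j in range(1, q + 1):
        out = out - phi[j - 1] * A[q - j: n - j]
    return out


def fgls_lasso(X, y, lam1, lam2, q):
    """Sketch of the three steps: LASSO, AR fit to residuals, LASSO on whitened data."""
    b_first = lasso_ista(X, y, lam1)           # step (i)
    phi_hat = fit_ar(y - X @ b_first, q)       # step (ii)
    X_w, y_w = whiten(X, phi_hat), whiten(y, phi_hat)
    return lasso_ista(X_w, y_w, lam2), phi_hat  # step (iii)


if __name__ == "__main__":
    # Synthetic example with AR(1) errors; all settings here are illustrative.
    rng = np.random.default_rng(0)
    n, p, rho = 200, 50, 0.9
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:5] = 1.0
    u = rng.standard_normal(n)
    e = np.zeros(n)
    e[0] = u[0]
    for t in range(1, n):
        e[t] = rho * e[t - 1] + u[t]
    y = X @ beta + e
    beta_hat, phi_hat = fgls_lasso(X, y, lam1=0.2, lam2=0.1, q=1)
    print("estimated AR coefficient:", phi_hat)
    print("l2 estimation error:", np.linalg.norm(beta_hat - beta))
```

Replacing the estimated `phi_hat` with the true AR coefficients in `whiten` gives the corresponding (infeasible) GLS-LASSO variant under the same simplifying assumptions.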

Paper Structure (20 sections, 14 theorems, 169 equations, 8 figures)

Introduction
The LASSO with Dependent Errors
Error Bounds for the LASSO
The AR($q$) Example
Empirical Behavior
Regularized Generalized Least Squares
GLS-LASSO
Feasible GLS-LASSO
Experimental Results
Simulation Setup
Results
Discussion
Appendix
Proofs for Section 2 and Preliminaries
Proof of Proposition X (sub-Gaussian)
...and 5 more sections

Key Result

Lemma 2.1

Under Assumption 1, we have [equation missing in source], where $c>0$ is an absolute constant.

Figures (8)

Figure 1: Heatmaps of $\Gamma$ (top) and $\Psi$ (bottom) corresponding to various stationary AR processes, with $\sigma_{u}^{2}=1$. Left: AR(1) $\phi_{1}=0.9$; Middle: AR(2) $\phi=(1.96,-0.97)$; Right: AR(10) $\phi_{j}=0$ for $j=1,\ldots,9$ and $\phi_{10}=0.9$. Note: whilst $\Gamma$ is always Toeplitz, the corresponding $\Psi$ is Toeplitz only for columns $j>q$.
Figure 2: Comparison of the growth in $\|L\|_{F}$ for an AR(1) process as a function of $n$, for different $\phi_{1}$ and initial variances $v_{0}$.
Figure 3: Estimation error $p^{-1/2}\|\hat{\Delta}\|_{2}$ achieved by the LASSO with varying AR(1) error parameter. The black line indicates performance in the independent error setting $\sigma_{u}^{2}=1$. Solid and dashed lines represent the mean and 95% confidence intervals respectively, based on 1000 simulations.
Figure 4: Comparison of $R^{\top}R$ and $\Sigma_{\mathrm{MA}(\pi)}$ (and their inverses) obtained when $e_{t}$ is generated by an AR(2) model with $\phi_{1}\approx2$, $\phi_{2}\approx-1$, i.e. on the edge of the stationary regime. Left and middle plots are generated for $n=10$ samples to highlight the discrepancy at the boundary, whilst the figures on the right represent the same plots for $n=100$ showing good approximation to the stationary autocovariance associated with this AR(2) model.
Figure 5: Estimation error (top: $p^{-1/2}\|\hat{\Delta}\|_{2}$, bottom: $\|\hat{\Delta}\|_{\infty}$) as a function of $n$ for $p=512$ and different settings of $\rho$; dashed lines indicate empirical 95% confidence intervals.
...and 3 more figures

Theorems & Definitions (27)

Lemma 2.1
Corollary 2.1
Lemma 3.1
Proposition 3.1
Lemma 3.2
Proposition 3.2
Proposition 3.3
Corollary 3.1
Proposition 3.4
proof
...and 17 more
