ARMAr-LASSO: Mitigating the Impact of Predictor Serial Correlation on the LASSO

Simone Tonini, Francesca Chiaromonte, Alessandro Giovannelli

TL;DR

The paper identifies a fundamental challenge for LASSO in time-series regressions: serial dependence in predictors and errors creates spurious correlations that impair estimation and forecasting. It introduces ARMAr-LASSO, which pre-whitens predictors via ARMA filtering and then applies LASSO to the ARMA residuals plus a small number of lagged responses, providing both finite-sample and asymptotic guarantees. The authors derive a density approximation for sample correlations under AR(1) structure, establish LASSO oracle inequalities and high-dimensional results under near-epoch dependence, and illustrate substantial gains through simulations and a Euro-area macroeconomic forecasting application. The method yields more parsimonious, accurate models and forecasts than several LASSO-based benchmarks, demonstrating robustness to factor structure and various ARMA specifications. These results offer a practical, theoretically grounded tool for high-dimensional time-series modeling where serial dependence threatens standard penalized regression approaches.

Abstract

We explore estimation and forecast accuracy for sparse linear models, focusing on scenarios where both predictors and errors are serially correlated. We establish a clear link between predictor serial correlation and the performance of the LASSO, showing that even orthogonal or weakly correlated stationary AR processes can exhibit significant spurious correlations because of their serial correlation. To address this challenge, we propose a novel approach named ARMAr-LASSO (ARMA residuals LASSO), which applies the LASSO to predictors that have been pre-whitened with ARMA filters, together with lags of the dependent variable. We derive both asymptotic results and oracle inequalities for the ARMAr-LASSO, demonstrating that it reduces estimation errors while also providing an effective forecasting and feature selection strategy. Our findings are supported by extensive simulations and an application to real-world macroeconomic data, which highlight the superior performance of the ARMAr-LASSO for handling sparse linear models in the context of time series.
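
The two-step idea described above (filter each predictor, then run the LASSO on the filter residuals plus lagged responses) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: it assumes a least-squares AR(1) filter in place of a general ARMA filter, a hand-written coordinate-descent LASSO with a fixed penalty `lam`, and a simulated data set whose sizes and coefficients are made up for the example. All function names (`ar1_residuals`, `lasso_cd`, `armar_lasso_design`, `simulate_ar1`) are hypothetical.

```python
import numpy as np

def simulate_ar1(phi, T, rng, scale=1.0, burn=100):
    """Simulate a zero-mean AR(1) series of length T (after a burn-in)."""
    u = rng.normal(scale=scale, size=T + burn)
    x = np.zeros(T + burn)
    for t in range(1, T + burn):
        x[t] = phi * x[t - 1] + u[t]
    return x[burn:]

def ar1_residuals(x):
    """Fit x_t = phi * x_{t-1} + u_t by least squares; return (phi_hat, u_hat).
    Assumption: an AR(1) filter stands in for the general ARMA filter."""
    x = x - x.mean()
    phi = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])
    return phi, x[1:] - phi * x[:-1]

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2T)||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.
    Columns and response are centered, so no intercept is estimated."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    T, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / T
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * b[j]            # drop coordinate j from the fit
            rho = X[:, j] @ resid / T
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid -= X[:, j] * b[j]            # add the updated coordinate back
    return b

def armar_lasso_design(X, y, n_lags=1):
    """Stack filter residuals of every predictor with n_lags lags of y,
    aligned so that each row corresponds to the same time index."""
    T = len(y)
    U = np.column_stack([ar1_residuals(X[:, j])[1] for j in range(X.shape[1])])
    lags = np.column_stack([y[n_lags - l:T - l] for l in range(1, n_lags + 1)])
    return np.hstack([U[n_lags - 1:], lags]), y[n_lags:]

# Hypothetical example: 20 independent AR(1) predictors, 2 of them relevant,
# and an AR(1) error term.
rng = np.random.default_rng(0)
T, n = 300, 20
X = np.column_stack([simulate_ar1(0.8, T, rng) for _ in range(n)])
eps = simulate_ar1(0.5, T, rng, scale=0.5)
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + eps

Z, target = armar_lasso_design(X, y, n_lags=1)
b = lasso_cd(Z, target, lam=0.2)
print("coefficients on residuals of predictors 0 and 1:", b[0], b[1])
print("coefficient on lagged response:", b[-1])
print("selected columns:", np.flatnonzero(np.abs(b) > 1e-8))
```

In this sketch the last column of `Z` is the lagged response, which absorbs the persistent part of the signal that the filter removes from the predictors. In practice the penalty would be chosen by cross-validation or an information criterion rather than fixed by hand.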

Paper Structure (8 sections, 3 theorems, 4 equations, 3 figures)

Key Result

Proposition 1

Let $\{\mathbf{x}_t\}$ be a stationary $n$-variate Gaussian AR(1) process with autoregressive residuals $\mathbf{u}_t \sim N\left(\pmb{0}_n,\pmb{I}_n\right)$. Let $\ddot{\phi}=\phi_i\phi_j$, where $\phi_i$ and $\phi_j$ are the autoregressive coefficients of the $i$-th and $j$-th ... where $T_v=\left\lfloor \frac{(T-1)(1-\ddot{\phi})^2-(1-\ddot{\phi}^2)}{(1-\cdots}\right\rfloor$ ...

Figures (3)

  • Figure 1: Monte Carlo densities $d(r)$ of $\widehat{c}_{ij}^{x}$ for different values of $T$ and $\phi$.
  • Figure 2: Monte Carlo densities $d(r)$ (blue histograms and dashed lines) and $\mathcal{D}(r)$ (red lines) for $\nu = 20$ and different values of $\phi$. The $p$-values correspond to the Shapiro test for Gaussianity.
  • Figure 3: Numerical "toy example". Panel (a) $T=50$, Panel (b) $T=100$, Panel (c) $T=250$. Orange circles/bars and blue triangles/bars represent, respectively, means/standard deviations of $\max_{i\neq j}|\widehat{c}_{ij}^x|$ and $\widehat{\psi}_{min}^x$, for various values of $\phi$, as obtained from 5000 Monte Carlo simulations.
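
The quantities summarized in Figure 3 can be approximated with a small Monte Carlo experiment. The sketch below is an assumption-laden illustration rather than a reproduction of the paper's setup: it uses `n = 10` independent Gaussian AR(1) predictors sharing one coefficient `phi`, `T = 50`, and only 200 replications (the figure reports 5000). The helper names are hypothetical. It reports the average of $\max_{i\neq j}|\widehat{c}_{ij}^x|$ and the average minimum eigenvalue of the sample correlation matrix.

```python
import numpy as np

def simulate_independent_ar1(phi, T, n, rng, burn=100):
    """Simulate n mutually independent AR(1) series with common coefficient phi."""
    u = rng.standard_normal((T + burn, n))
    x = np.zeros_like(u)
    for t in range(1, T + burn):
        x[t] = phi * x[t - 1] + u[t]
    return x[burn:]

def spurious_summary(phi, T, n=10, reps=200, seed=0):
    """Monte Carlo means of the largest absolute off-diagonal sample correlation
    and of the smallest eigenvalue of the sample correlation matrix."""
    rng = np.random.default_rng(seed)
    off_diag = ~np.eye(n, dtype=bool)
    max_corr, min_eig = [], []
    for _ in range(reps):
        C = np.corrcoef(simulate_independent_ar1(phi, T, n, rng), rowvar=False)
        max_corr.append(np.abs(C[off_diag]).max())
        min_eig.append(np.linalg.eigvalsh(C)[0])   # eigenvalues are ascending
    return float(np.mean(max_corr)), float(np.mean(min_eig))

results = {phi: spurious_summary(phi, T=50) for phi in (0.0, 0.5, 0.9)}
for phi, (mc, me) in results.items():
    print(f"phi={phi:.1f}  mean max|corr|={mc:.3f}  mean min eigenvalue={me:.3f}")
```

Although the simulated predictors are independent by construction, the largest sample correlation grows and the smallest eigenvalue shrinks as `phi` increases, which is the spurious-correlation effect the figure illustrates.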

Theorems & Definitions (10)

  • Proposition 1
  • Remark 1
  • Remark 2
  • Proposition 2
  • Corollary 1
  • Remark 1
  • Example 1
  • Remark 2
  • Example 2
  • Example 3