LASSO Inference for High Dimensional Predictive Regressions
Zhan Gao, Ji Hyung Lee, Ziwei Mei, Zhentao Shi
TL;DR
This paper addresses valid inference in high-dimensional predictive regressions where the regressors may be a mix of local-to-unit-root and stationary processes. It introduces XDlasso, a two-stage debiasing approach that combines an IVX-based instrument with the desparsified LASSO to remove both shrinkage bias and Stambaugh bias, yielding asymptotically normal estimators and Wald tests without prior knowledge of which regressors are nonstationary. The authors establish the restricted eigenvalue (RE) and deviation bound (DB) conditions, prove consistency of the standardized LASSO (Slasso) that serves as the workhorse estimator, derive convergence rates for the auxiliary regression, and prove asymptotic normality of the XDlasso estimator together with the validity of a Wald test for joint hypotheses. Monte Carlo simulations show that XDlasso achieves accurate size and competitive power under mixed persistence and heteroskedasticity. Empirical applications to stock return predictability and inflation demonstrate its robustness and the practical value of inference in data-rich macro-financial settings.
Abstract
LASSO inflicts shrinkage bias on estimated coefficients, which undermines asymptotic normality and invalidates standard inferential procedures based on the t-statistic. For cross-sectional data, the desparsified LASSO has emerged as a well-known remedy for correcting the shrinkage bias. In the context of high-dimensional predictive regression, the desparsified LASSO faces an additional challenge: the Stambaugh bias arising from nonstationary regressors modeled as local unit roots. To restore standard inference, we propose a novel estimator called IVX-desparsified LASSO (XDlasso). XDlasso simultaneously eliminates both shrinkage bias and Stambaugh bias and does not require prior knowledge of which regressors are nonstationary and which are stationary. We establish the asymptotic properties of XDlasso for hypothesis testing, and our theoretical findings are supported by Monte Carlo simulations. Applying our method to data from the FRED-MD database, we investigate two important empirical questions: (i) the predictability of U.S. stock returns based on the earnings-price ratio, and (ii) the predictability of U.S. inflation using the unemployment rate.
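To make the IVX ingredient concrete, the following minimal sketch builds a mildly integrated instrument from a single regressor using the standard IVX recursion z_t = rho_z * z_{t-1} + (x_t - x_{t-1}), with z_0 = 0 and rho_z = 1 - c_z / n^tau. The function name `ivx_instrument` and the default values of `c_z` and `tau` are illustrative assumptions, not the tuning used in the paper, and the sketch does not show how the instrument enters the desparsified LASSO step.

```python
import random


def ivx_instrument(x, c_z=1.0, tau=0.5):
    """Return (z, rho_z) for an IVX-style instrument built from series x.

    Hypothetical helper for illustration only. c_z > 0 and 0 < tau < 1
    are assumed tuning constants, not the paper's choices.
    """
    n = len(x)
    rho_z = 1.0 - c_z / n ** tau  # persistence milder than a unit root
    z = [0.0] * n                 # z_0 = 0 by convention
    for t in range(1, n):
        # Filter the first difference of x with an AR(1) root rho_z.
        z[t] = rho_z * z[t - 1] + (x[t] - x[t - 1])
    return z, rho_z


# Usage: a simulated random walk as a stand-in for a persistent regressor.
random.seed(0)
walk = [0.0]
for _ in range(199):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

z_walk, rho_walk = ivx_instrument(walk)
```

Because the recursion uses only first differences of the regressor, it can be applied to every regressor in the same way, whether that regressor is stationary or highly persistent. This matches the abstract's point that the identities of the nonstationary regressors need not be known in advance.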
