Consistent support recovery for high-dimensional diffusions
Dmytro Marushkevych, Francisco Pina, Mark Podolskij
TL;DR
This work develops a theoretical framework for the adaptive Lasso in high-dimensional diffusions with sparse drift, proving sign consistency and asymptotic normality under sparsity and regularity assumptions. By leveraging a data-driven weighted $\ell_1$ penalty and a suitable pre-estimator, it extends oracle-like properties to diffusion settings where the parameter dimension $p$ can grow with the time horizon $T$ and the dimension $d$, and it includes a marginal pre-estimator for the case $p \gg d$ under partial orthogonality. The analysis relies on concentration inequalities for additive functionals and careful control of the Fisher information-type matrices, martingale terms, and event probabilities. Numerical experiments support the theory, showing that the adaptive Lasso recovers the support and estimates the drift more accurately than the standard Lasso and the MLE, along with empirical evidence of asymptotic normality for the active coefficients. The results provide practical guidance on tuning and demonstrate the method's robustness for high-dimensional stochastic processes.
Abstract
Statistical inference for stochastic processes has advanced significantly due to applications in diverse fields, but challenges remain in high-dimensional settings where the number of parameters is allowed to grow with the sample size. This paper analyzes a $d$-dimensional ergodic diffusion process under sparsity constraints, focusing on the adaptive Lasso estimator, which improves on the standard Lasso in both variable selection and bias. We derive conditions under which the adaptive Lasso achieves the support recovery property and asymptotic normality for the drift parameter, with a focus on linear models. Explicit relationships between the parameters guide tuning for optimal performance, and a marginal estimator is proposed for the $p \gg d$ regime under a partial orthogonality assumption. Numerical studies confirm the adaptive Lasso's superiority over the standard Lasso and the MLE in accuracy and support recovery, providing robust solutions for high-dimensional stochastic processes.
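As a rough sketch of the type of estimator discussed above, the adaptive Lasso for a diffusion drift can be written as a weighted $\ell_1$-penalized likelihood problem. The notation here is an assumption made for illustration and is not taken from the paper: $b_\theta$ denotes the drift with parameter $\theta \in \mathbb{R}^p$, the diffusion coefficient is taken to be the identity, $\tilde\theta$ is a pre-estimator (for example the MLE), and $\lambda > 0$ and $\gamma > 0$ are tuning parameters.

$$
L_T(\theta) = -\frac{1}{T}\int_0^T b_\theta(X_t)^\top \, dX_t + \frac{1}{2T}\int_0^T \|b_\theta(X_t)\|_2^2 \, dt,
\qquad
\hat\theta_{\mathrm{AL}} \in \operatorname*{arg\,min}_{\theta \in \mathbb{R}^p} \Big\{ L_T(\theta) + \lambda \sum_{j=1}^{p} w_j |\theta_j| \Big\},
\quad
w_j = |\tilde\theta_j|^{-\gamma}.
$$

Under these assumptions, coordinates with a small pre-estimate $|\tilde\theta_j|$ receive a large weight and are pushed towards zero, while coordinates with a large pre-estimate are penalized only lightly. This is the mechanism by which a weighted penalty can reduce the shrinkage bias of the standard Lasso, which corresponds to $w_j \equiv 1$.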
