Consistent support recovery for high-dimensional diffusions

Dmytro Marushkevych, Francisco Pina, Mark Podolskij

TL;DR

This work develops a theoretical framework for adaptive Lasso in high-dimensional diffusions with sparse drift, proving sign consistency and asymptotic normality under sparsity and regularity assumptions. By leveraging a data-driven weighted $\ell_1$ penalty and a suitable pre-estimator, it extends oracle-like properties to diffusion settings where $p$ can grow with time horizon $T$ and dimension $d$, including a marginal pre-estimator for $p\gg d$ under partial orthogonality. The analysis relies on concentration inequalities for additive functionals and careful control of the Fisher information-type matrices, martingale terms, and event probabilities. Numerical experiments validate the theory, showing superior support recovery and estimation accuracy of the adaptive Lasso relative to standard Lasso and MLE, along with empirical evidence of asymptotic normality for the active coefficients. The results provide practical guidance on tuning and demonstrate the method's robustness for high-dimensional stochastic processes.

Abstract

Statistical inference for stochastic processes has advanced significantly due to applications in diverse fields, but challenges remain in high-dimensional settings where the number of parameters is allowed to grow with the sample size. This paper analyzes a $d$-dimensional ergodic diffusion process under sparsity constraints, focusing on the adaptive Lasso estimator, which improves on the standard Lasso in both variable selection and bias. We derive conditions under which the adaptive Lasso achieves the support recovery property and asymptotic normality for the drift parameter, with a focus on linear models. Explicit relationships between the parameters guide tuning for optimal performance, and a marginal estimator is proposed for $p \gg d$ scenarios under a partial orthogonality assumption. Numerical studies confirm the adaptive Lasso's superiority over the standard Lasso and the MLE in accuracy and support recovery, providing robust solutions for high-dimensional stochastic processes.
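To make the weighted $\ell_1$ penalty concrete, the following is a minimal sketch, not the authors' implementation. It assumes the negative log-likelihood is quadratic in the drift parameter, $\frac{1}{2}\theta^\top G\theta - h^\top\theta$, as is the case when the drift is linear in $\theta$. The names `G`, `h`, `pre_estimate`, `lam` and `gamma`, the weight choice $w_j = 1/|\tilde\theta_j|^{\gamma}$, and the coordinate-descent solver are all illustrative assumptions rather than details taken from the paper.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def adaptive_lasso_quadratic(G, h, pre_estimate, lam, gamma=1.0,
                             n_sweeps=200, eps=1e-12):
    """Minimise 0.5 * theta' G theta - h' theta + lam * sum_j w_j * |theta_j|
    by cyclic coordinate descent, with data-driven weights
    w_j = 1 / |pre_estimate_j|**gamma (hypothetical choice for illustration).

    G is assumed symmetric with a strictly positive diagonal.
    """
    p = len(h)
    # Small pre-estimates give large weights, hence stronger shrinkage.
    weights = [1.0 / (abs(b) ** gamma + eps) for b in pre_estimate]
    theta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual: remove the contribution of all other coordinates.
            r = h[j] - sum(G[j][k] * theta[k] for k in range(p) if k != j)
            theta[j] = soft_threshold(r, lam * weights[j]) / G[j][j]
    return theta


# Toy example with an identity information matrix, so the unpenalised
# minimiser (used here as the pre-estimator) is simply h.
G = [[1.0, 0.0], [0.0, 1.0]]
h = [2.0, 0.1]
theta_hat = adaptive_lasso_quadratic(G, h, pre_estimate=h, lam=0.1)
```

In this toy example the update reduces to soft-thresholding each coordinate of `h` at level `lam * w_j`: the large coefficient is shrunk only slightly (threshold about 0.05), while the small one faces a threshold of about 1.0 and is set exactly to zero. This is the mechanism by which the adaptive weights reduce bias on active coefficients while still removing inactive ones.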

Paper Structure

This paper contains 13 sections, 14 theorems, 102 equations, 5 figures.

Key Result

Theorem 2.5

Assume that Assumptions (A1)-(A3), (B1)-(B3) and (C) hold. Then we have that

Figures (5)

  • Figure 1: Comparison of the overall performance of the MLE, Lasso estimator, and adaptive Lasso in terms of support recovery and prediction accuracy.
  • Figure 2: Comparison of the support recovery errors over time for the MLE, Lasso estimator, and adaptive Lasso, including the corresponding standard deviations.
  • Figure 3: $\ell_1$ mean error for the MLE, the Lasso estimator, and the adaptive Lasso, $\pm$ one standard deviation.
  • Figure 4: $\ell_2$ mean error for the MLE, the Lasso estimator, and the adaptive Lasso, $\pm$ one standard deviation.
  • Figure 5: Empirical density distributions at different time horizons compared with the theoretical standard Gaussian density.

Theorems & Definitions (38)

  • Remark 1.1
  • Remark 1.2
  • Definition 1
  • Remark 2.1
  • Remark 2.2
  • Remark 2.3
  • Remark 2.4
  • Theorem 2.5
  • proof
  • Theorem 2.6
  • ...and 28 more