Sampling effects on Lasso estimation of drift functions in high-dimensional diffusion processes
Chiara Amorino, Francisco Pina, Mark Podolskij
TL;DR
The paper tackles high-dimensional drift estimation for diffusion processes observed at discrete times under a sparsity assumption. It develops an oracle inequality for the Lasso estimator by controlling the probability of three key events, which concern martingale fluctuations, discretization bias, and a compatibility condition. It shows that the discretization error can be negligible under suitable sampling, recovering the optimal continuous-observation rate. Two models are analyzed: a general linear drift and a canonical OU process, with concentration inequalities tailored to each setting (martingale-based and Malliavin-calculus-based, respectively). Theoretical results are complemented by numerical experiments in which the Lasso achieves better support recovery than the MLE in high dimensions. Overall, the work provides precise finite-sample error bounds and clear guidance on how sampling, dimension, and sparsity interact to determine convergence rates in discretely observed high-dimensional diffusions.
Abstract
In this paper, we address high-dimensional parametric estimation of the drift function in diffusion models, specifically focusing on a $d$-dimensional ergodic diffusion process observed at discrete time points. We consider both a general linear form for the drift function and the particular case of the Ornstein-Uhlenbeck (OU) process. Assuming sparsity of the parameter vector, we examine the statistical behavior of the Lasso estimator for the unknown parameter. Our primary contribution is the proof of an oracle inequality for the Lasso estimator, which holds on the intersection of three specific sets defined for our analysis. Carefully controlling the probability of these sets is the central challenge of our study. This approach allows us to derive error bounds in the $\ell_1$ and $\ell_2$ norms, assessing the performance of the proposed Lasso estimator. Our results demonstrate that, under certain conditions, the discretization error becomes negligible, enabling us to achieve the same optimal rate of convergence as if the continuous trajectory of the process were observed. We validate our theoretical findings through numerical experiments, which show that the Lasso estimator significantly outperforms the maximum likelihood estimator (MLE) in terms of support recovery.
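
To make the setting concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes an OU model of the form $dX_t = -A_0 X_t\,dt + dW_t$ with a sparse matrix $A_0$, simulates it with an Euler scheme, and fits an $\ell_1$-penalized least-squares contrast on the discretized increments by proximal gradient descent (ISTA). The function names `simulate_ou` and `lasso_drift`, the choice of `A0`, the sampling parameters, and the tuning `lam = sqrt(log(d) / T)` are all assumptions made for illustration; the paper's exact contrast and tuning may differ.

```python
import numpy as np

def simulate_ou(A0, T, delta, rng):
    """Euler scheme for dX = -A0 X dt + dW on [0, T] with step delta (assumed model)."""
    d = A0.shape[0]
    n = int(round(T / delta))
    X = np.zeros((n + 1, d))
    noise = rng.standard_normal((n, d)) * np.sqrt(delta)
    for i in range(n):
        X[i + 1] = X[i] - (A0 @ X[i]) * delta + noise[i]
    return X

def lasso_drift(X, delta, lam, n_iter=500):
    """Minimize (1/2n)||Y - Z B||_F^2 + lam * ||B||_1 by ISTA, where
    Z holds X_{t_i} and Y = -(X_{t_{i+1}} - X_{t_i}) / delta. Returns A_hat = B^T."""
    Z = X[:-1]
    Y = -(X[1:] - X[:-1]) / delta
    n = Z.shape[0]
    G = Z.T @ Z / n                           # empirical Gram matrix
    c = Z.T @ Y / n
    step = 1.0 / np.linalg.eigvalsh(G)[-1]    # 1 / Lipschitz constant of the gradient
    B = np.zeros_like(G)
    for _ in range(n_iter):
        B = B - step * (G @ B - c)            # gradient step on the quadratic part
        B = np.sign(B) * np.maximum(np.abs(B) - step * lam, 0.0)  # soft-thresholding
    return B.T

rng = np.random.default_rng(0)
d, T, delta = 10, 200.0, 0.01                 # hypothetical dimension and sampling scheme
A0 = np.eye(d)
A0[0, 1], A0[2, 3] = 0.5, -0.5                # hypothetical sparse drift matrix
X = simulate_ou(A0, T, delta, rng)

lam = np.sqrt(np.log(d) / T)                  # heuristic tuning, an assumption
A_lasso = lasso_drift(X, delta, lam)

# Unpenalized least squares on the same increments, for comparison.
Z, Y = X[:-1], -(X[1:] - X[:-1]) / delta
A_ls = np.linalg.lstsq(Z, Y, rcond=None)[0].T

print("nonzeros (Lasso):", np.count_nonzero(A_lasso))
print("nonzeros (least squares):", np.count_nonzero(A_ls))
```

In this sketch the unpenalized fit generically returns a dense matrix, while the soft-thresholding step sets small entries exactly to zero, which is the mechanism behind the support-recovery comparison described in the abstract. The penalty also shrinks the retained entries toward zero, so the nonzero estimates are biased in magnitude.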
