On Consistency of Signature Using Lasso

Xin Guo, Binnan Wang, Ruixun Zhang, Chaoyi Zhao

TL;DR

This work analyzes the statistical consistency of Lasso regression when applied to signature features of stochastic paths, contrasting Itô and Stratonovich definitions across Brownian motion, OU processes, and their discrete-time analogs. It establishes a probabilistic uniqueness of the signature-based linear representation, characterizes the correlation structures of signature components, and proves both asymptotic and finite-sample consistency under irrepresentability-type conditions. Simulations show Itô signatures are typically more consistent for Brownian-like data, while Stratonovich signatures perform better for mean-reverting dynamics; these insights extend to nonlinear learning and option pricing, where signature-based models yield high accuracy and interpretability via signature derivatives. The results provide practical guidance on selecting signature definitions and truncation orders to optimize predictive performance in time-series analysis and financial applications.

Abstract

Signatures are iterated path integrals of continuous and discrete-time processes, and their universal nonlinearity linearizes the problem of feature selection in time series data analysis. This paper studies the consistency of signature using Lasso regression, both theoretically and numerically. We establish conditions under which the Lasso regression is consistent both asymptotically and in finite sample. Furthermore, we show that the Lasso regression is more consistent with the Itô signature for time series and processes that are closer to the Brownian motion and with weaker inter-dimensional correlations, while it is more consistent with the Stratonovich signature for mean-reverting time series and processes. We demonstrate that signature can be applied to learn nonlinear functions and option prices with high accuracy, and the performance depends on properties of the underlying process and the choice of the signature.
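
The difference between the two signature definitions can be seen on a discretely sampled path. The following is a minimal sketch, not the authors' code: the helper name `signature_order2` and the simulated random walk are assumptions made for illustration. It approximates order-2 iterated integrals with left-point sums (Itô) and midpoint sums (Stratonovich).

```python
import random

def signature_order2(path, kind="stratonovich"):
    """Order-1 and order-2 signature terms of a discretely sampled path.

    path: list of points, each a list of d floats.
    kind: "ito" uses left-point sums, "stratonovich" uses midpoint sums.
    """
    d = len(path[0])
    level1 = [0.0] * d                       # running increments X_t - X_0
    level2 = [[0.0] * d for _ in range(d)]   # level2[i][j] ~ int (X^i - X^i_0) dX^j
    for prev, curr in zip(path, path[1:]):
        dx = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            # Integrand evaluated at the left endpoint (Ito) or midpoint (Stratonovich).
            base = level1[i] if kind == "ito" else level1[i] + 0.5 * dx[i]
            for j in range(d):
                level2[i][j] += base * dx[j]
        level1 = [a + b for a, b in zip(level1, dx)]
    return level1, level2

# Hypothetical data: a one-dimensional Gaussian random walk on [0, 1].
random.seed(0)
n_steps, horizon = 1000, 1.0
dt = horizon / n_steps
path = [[0.0]]
for _ in range(n_steps):
    path.append([path[-1][0] + random.gauss(0.0, dt ** 0.5)])

lvl1, ito2 = signature_order2(path, "ito")
_, strat2 = signature_order2(path, "stratonovich")
x_T = lvl1[0]
quad_var = sum((b[0] - a[0]) ** 2 for a, b in zip(path, path[1:]))

# Midpoint sums telescope, so the Stratonovich term equals x_T^2 / 2.
print(strat2[0][0] - 0.5 * x_T ** 2)
# The two definitions differ by half the realized quadratic variation.
print(strat2[0][0] - ito2[0][0] - 0.5 * quad_var)
```

Both printed differences are zero up to floating-point rounding. The gap between the two definitions is driven by quadratic variation, which is why the choice of definition changes the correlation structure of the features passed to the Lasso.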

Paper Structure

This paper contains 43 sections, 20 theorems, 167 equations, 19 figures, 2 tables.

Key Result

Theorem 1

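Let $\mathbf{X}_t$ be a continuous $\mathbb{R}^d$-valued semimartingale and $\mathcal{S}$ be a compact subset of paths of the time-augmented process $\tilde{\mathbf{X}}_t = (t, \mathbf{X}_t^\top)^\top$ from time 0 to $T$. See the appendix on time augmentation for details. Assume that $f: \mathcal{S} \to \mathbb{R}$ is a real-valued continuous function. Then for any $\varepsilon > 0$, there exists a linear functional $L$ such that $\sup_{s \in \mathcal{S}} |f(s) - L(\mathrm{Sig}(s))| < \varepsilon$, where $\mathrm{Sig}(s)$ is the signature of $s$.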
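
As a worked illustration of universal nonlinearity, consider a simple case that is an assumption of this example rather than part of the theorem: $d = 1$, $\mathbf{X}_t = W_t$ a standard Brownian motion with $W_0 = 0$, and target $f(s) = W_T^2$. Index 0 denotes the time component and index 1 the $W$ component of the time-augmented path. The notation $S^{(\cdot)}$ for Stratonovich terms and $I^{(\cdot)}$ for Itô terms is introduced here for illustration only. In this case the linear representation is exact, but the coefficients depend on the signature definition:

```latex
\begin{aligned}
S^{(1,1)} &= \int_0^T W_t \circ \mathrm{d}W_t = \tfrac{1}{2} W_T^2
  &&\Longrightarrow\quad W_T^2 = 2\, S^{(1,1)}, \\
I^{(1,1)} &= \int_0^T W_t \,\mathrm{d}W_t = \tfrac{1}{2}\bigl(W_T^2 - T\bigr),
  \qquad I^{(0)} = \int_0^T \mathrm{d}t = T
  &&\Longrightarrow\quad W_T^2 = 2\, I^{(1,1)} + I^{(0)}.
\end{aligned}
```

The same target therefore has one nonzero coefficient under the Stratonovich signature and two under the Itô signature, so the set of true predictors that the Lasso must recover differs between the two definitions.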

Figures (19)

  • Figure 1: Consistency rates for the Brownian motion and the random walk with different values of inter-dimensional correlation $\rho$ and different numbers of true predictors $q$. Solid (dashed) lines correspond to the Itô (Stratonovich) signature.
  • Figure 2: OOS MSE for the Brownian motion and the random walk with different values of inter-dimensional correlation $\rho$ and different numbers of true predictors $q$. Solid (dashed) lines correspond to the Itô (Stratonovich) signature.
  • Figure 3: Consistency rates for the OU process and the AR(1) model with different parameters ($\kappa$ and $1-\phi$) and different numbers of true predictors $q$. Solid (dashed) lines correspond to the Itô (Stratonovich) signature.
  • Figure 4: In-sample and out-of-sample $R^2$ for learning option payoffs using different types of predictors.
  • Figure 5: Lasso paths with signatures as predictors.
  • ...and 14 more figures

Theorems & Definitions (31)

  • Definition 1: Signature
  • Theorem 1: Universal nonlinearity (Cuchiero et al., 2023)
  • Definition 2: Sign consistency
  • Definition 3: $l_\infty$ consistency
  • Definition 4: Irrepresentable condition
  • Theorem 2: Uniqueness
  • Proposition 1
  • Theorem 3
  • Proposition 2
  • Theorem 4
  • ...and 21 more
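
The irrepresentable condition named in Definition 4 is not reproduced in this summary. The sketch below assumes the standard form from the Lasso literature (Zhao and Yu, 2006): with active set $A$ of true predictors, the quantity $\| \Sigma_{A^c A} \Sigma_{AA}^{-1} \mathrm{sign}(\beta_A) \|_\infty$ must be strictly below 1. The function name `irrepresentable_gap` and the example covariance matrices are hypothetical.

```python
import numpy as np

def irrepresentable_gap(cov, beta, tol=1e-12):
    """Max over inactive predictors of |Sigma_{A^c,A} Sigma_{A,A}^{-1} sign(beta_A)|.

    Under the assumed (Zhao and Yu style) condition, a value below 1 means
    the irrepresentable condition holds for this covariance and coefficient vector.
    """
    cov = np.asarray(cov, dtype=float)
    beta = np.asarray(beta, dtype=float)
    active = np.abs(beta) > tol              # boolean mask of true predictors
    if active.all():
        return 0.0                           # no inactive predictors to check
    sigma_aa = cov[np.ix_(active, active)]
    sigma_ca = cov[np.ix_(~active, active)]
    weights = np.linalg.solve(sigma_aa, np.sign(beta[active]))
    return float(np.max(np.abs(sigma_ca @ weights)))

# Hypothetical example: two uncorrelated true predictors and one inactive
# predictor correlated with both of them.
beta = [1.0, 1.0, 0.0]
weak = [[1.0, 0.0, 0.3],
        [0.0, 1.0, 0.3],
        [0.3, 0.3, 1.0]]
strong = [[1.0, 0.0, 0.6],
          [0.0, 1.0, 0.6],
          [0.6, 0.6, 1.0]]

print(irrepresentable_gap(weak, beta) < 1.0)    # condition holds
print(irrepresentable_gap(strong, beta) < 1.0)  # condition fails
```

In a signature setting, `cov` would be the correlation matrix of the signature components, so stronger correlation between true and irrelevant components makes the condition harder to satisfy.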