The Volterra signature

Paul P. Hager, Fabian N. Harang, Luca Pelizzari, Samy Tindel

TL;DR

An injectivity statement is proved (identifiability under augmentation) that leads to a universal approximation theorem on the infinite dimensional path space, which in certain cases is achieved by linear functionals of $\mathrm{VSig}(x;K)$.

Abstract

Modern approaches for learning from non-Markovian time series, such as recurrent neural networks, neural controlled differential equations or transformers, typically rely on implicit memory mechanisms that can be difficult to interpret or to train over long horizons. We propose the Volterra signature $\mathrm{VSig}(x;K)$ as a principled, explicit feature representation for history-dependent systems. By developing the input path $x$ weighted by a temporal kernel $K$ into the tensor algebra, we leverage the associated Volterra--Chen identity to derive rigorous learning-theoretic guarantees. Specifically, we prove an injectivity statement (identifiability under augmentation) that leads to a universal approximation theorem on the infinite dimensional path space, which in certain cases is achieved by linear functionals of $\mathrm{VSig}(x;K)$. Moreover, we demonstrate applicability of the kernel trick by showing that the inner product associated with Volterra signatures admits a closed characterization via a two-parameter integral equation, enabling numerical methods from PDEs for computation. For a large class of exponential-type kernels, $\mathrm{VSig}(x;K)$ solves a linear state-space ODE in the tensor algebra. Combined with inherent invariance to time reparameterization, these results position the Volterra signature as a robust, computationally tractable feature map for data science. We demonstrate its efficacy in dynamic learning tasks on real and synthetic data, where it consistently improves on classical path signature baselines.
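
The state-space claim for exponential-type kernels can be illustrated on a single scalar term. The sketch below is not the paper's construction: it assumes, purely for illustration, a first-level kernel-weighted integral $z_t=\int_0^t e^{-\lambda(t-s)}\,dx_s$ with a hypothetical decay rate `lam`, discretized by a left-point sum. For this kernel the integral satisfies the linear ODE $dz_t=-\lambda z_t\,dt+dx_t$, so the full history sum collapses to a one-step recursion. The function names are made up for this example.

```python
import math

def exp_kernel_direct(x, times, lam):
    """Left-point sum for z_t = int_0^t exp(-lam*(t-s)) dx_s on the grid (O(N^2))."""
    z = [0.0]
    for n in range(1, len(times)):
        total = 0.0
        for k in range(n):
            # kernel evaluated at the left endpoint t_k of each increment
            total += math.exp(-lam * (times[n] - times[k])) * (x[k + 1] - x[k])
        z.append(total)
    return z

def exp_kernel_recursive(x, times, lam):
    """Same quantity via the state-space recursion z_{n+1} = decay * (z_n + dx_n) (O(N))."""
    z = [0.0]
    for n in range(len(times) - 1):
        decay = math.exp(-lam * (times[n + 1] - times[n]))
        z.append(decay * (z[-1] + (x[n + 1] - x[n])))
    return z

# Illustrative input: a sine path sampled on a uniform grid over [0, 1].
times = [i / 100 for i in range(101)]
x = [math.sin(2 * math.pi * t) for t in times]
z_direct = exp_kernel_direct(x, times, lam=3.0)
z_rec = exp_kernel_recursive(x, times, lam=3.0)
max_gap = max(abs(a - b) for a, b in zip(z_direct, z_rec))
```

Both routines compute the same discrete sum, so `max_gap` is at the level of floating-point round-off. The recursion carries only the current state instead of the whole past, which is the practical appeal of a state-space form.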

Paper Structure

This paper contains 17 sections, 24 theorems, 193 equations, 3 figures, 2 tables.

Key Result

Lemma 2.19

Let $\mathcal{V}^1([0,T];\mathbb{R}^m)$ be the space introduced in Definition def:volterra_path. For $z: \Delta^2\to\mathbb{R}^m$ it holds that $z \in \mathcal{V}^1([0,T];\mathbb{R}^m)$ if and only if it is of the form in for some $x\in\mathcal{C}^{0,1}([0,T];\mathbb{R}^d)$ and $K\in L^{\infty,1}(\Delta^2; \mathcal{L}(\mathbb{R}^d;\mathbb{R}^m))$.

Figures (3)

  • Figure 4.1: Volterra signature vs. classical signature expansions, compared with the fractional SDE solution for one testing sample. The models were trained with $M=900$ training samples and $N=500$ time-steps on $[0,1]$. Parameters: $Y_0=1$, $b_0=0$, $b_1=-1$, $\sigma_0=1$, $\sigma_1=0.5$, signature truncation $L=5$.
  • Figure 4.2: Next-day forecast of realized S&P 500 volatility using our method VSig, compared with the benchmark HAR, on the test set. The lower subplot shows the absolute forecast errors $|\widehat{y}-y|$ for both methods, where $y$ denotes the realized volatility.
  • Figure 4.3: Left: Coefficient of determination $R^2$ as a (linearly interpolated) function of the past-window size $p$ (days), reported on the training and test sets for the methods VSig and Sig, and the (constant) benchmark HAR. Right: Scatter plots of realized volatility $y$ versus predictions $\widehat{y}$ for our method VSig and the HAR benchmark.

Theorems & Definitions (86)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Definition 2.6
  • Definition 2.7
  • Definition 2.8
  • Example 2.11
  • Example 2.12
  • ...and 76 more