Sparse Signature Coefficient Recovery via Kernels

Daniil Shmelev, Cristopher Salvi

TL;DR

This work addresses the problem of efficiently extracting a sparse set of high-depth signature coefficients. It introduces a kernel-based filtering framework that expresses a desired coefficient as a linear combination of signature kernels, each computable with a PDE-based solver, and isolates a single coefficient through anagram-class and order-isolation constructions built from path scalings and axis paths. The authors establish exact and approximate schemes, based on truncated kernels and Vandermonde systems, that recover $S(x)^I$ with favorable, parallelizable complexity, and demonstrate applicability to sparse Euler schemes for CDEs and state-space models. The framework yields practical speed-ups for large signatures while preserving accuracy, and is supported by rigorous proofs and a numerical strategy for batch coefficient retrieval. Finally, a toy CDE example and an accompanying code implementation illustrate the method's potential for real-time, sparse inference in sequential data settings.

Abstract

Central to rough path theory is the signature transform of a path, an infinite series of tensors given by the iterated integrals of the underlying path. The signature provides an effective way to capture sequentially ordered information, thanks both to its rich analytic and algebraic properties and to its universality when used as a basis to approximate functions on path space. Whilst a truncated version of the signature can be efficiently computed using Chen's identity, there is a lack of efficient methods for computing a sparse collection of iterated integrals contained in high levels of the signature. We address this problem by leveraging signature kernels, defined as the inner product of two signatures and efficiently computable by means of PDE-based methods. By forming a filter in signature space with which to take kernels, one can effectively isolate specific groups of signature coefficients and, in particular, a single coefficient at any depth of the transform. We show that such a filter can be expressed as a linear combination of suitable signature transforms and demonstrate empirically the effectiveness of our approach. To conclude, we give an example use case for sparse collections of signature coefficients based on the construction of N-step Euler schemes for sparse CDEs.
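The idea of isolating one level of the signature from kernel evaluations can be illustrated in a toy setting. Scaling a path by $\lambda$ multiplies level $k$ of its signature by $\lambda^k$, so a depth-truncated kernel of the scaled path against a fixed path is a polynomial in $\lambda$ whose coefficients are the level-wise inner products; evaluating it at several scalings and solving a Vandermonde system recovers them. The sketch below is not the authors' implementation: it assumes straight-line paths, for which the kernel has a closed form and no PDE solver is needed, and the names `truncated_kernel_linear`, `solve_linear_system` and `level_inner_products` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def truncated_kernel_linear(dx, dy, depth):
    """Depth-truncated signature kernel of two straight-line paths.

    Assumption: both paths are single linear segments with increments dx, dy.
    Level k of such a signature is the k-fold tensor power of the increment
    divided by k!, so the level-k inner product is <dx, dy>^k / (k!)^2.
    """
    c = sum(a * b for a, b in zip(dx, dy))
    return sum(c**k / Fraction(factorial(k) ** 2) for k in range(depth + 1))


def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination over exact rationals (fine for tiny systems)."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def level_inner_products(dx, dy, depth):
    """Recover <S(x)^(k), S(y)^(k)> for k = 0..depth from kernels of scaled paths."""
    # Distinct scalings; the choice 1, 2, ..., depth + 1 is arbitrary.
    scales = [Fraction(j + 1) for j in range(depth + 1)]
    vandermonde = [[lam**k for k in range(depth + 1)] for lam in scales]
    kernels = [
        truncated_kernel_linear([lam * a for a in dx], dy, depth) for lam in scales
    ]
    return solve_linear_system(vandermonde, kernels)


dx, dy = [1, 2], [3, -1]  # <dx, dy> = 1, so level k should equal 1 / (k!)^2
levels = level_inner_products(dx, dy, depth=4)
print(levels)
```

Exact rational arithmetic is used here only to keep the toy system free of the ill-conditioning that Vandermonde matrices exhibit in floating point; with an untruncated kernel the recovery would be approximate rather than exact.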

Paper Structure

This paper contains 20 sections, 18 theorems, 111 equations, 4 figures.

Key Result

Lemma 1.5

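Let $x \in C_1([a,b], V)$ and $k \in \mathbb{N}$. Then
$$\lVert S(x)^{(k)} \rVert \leq \frac{\lVert x \rVert_{1,[a,b]}^{k}}{k!},$$
where $S(x)^{(k)}$ denotes the level-$k$ term of the signature of $x$ over $[a,b]$ and $\lVert x \rVert _{1,[a,b]}$ is the 1-variation of $x$ on $[a,b]$.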
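The factorial decay bound, which states that the norm of level $k$ of the signature is at most $\lVert x \rVert_{1,[a,b]}^k / k!$, can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than code from the paper: it takes a piecewise-linear path in $\mathbb{R}^2$, builds its truncated signature segment by segment with Chen's relation, and compares the Hilbert-Schmidt norm of each level against the bound. All function names are hypothetical, and the example path is arbitrary.

```python
import itertools
import math


def segment_signature(delta, depth):
    """Truncated signature of a straight line with increment `delta`.

    The coefficient of the word (i_1, ..., i_k) is
    delta[i_1] * ... * delta[i_k] / k!.
    """
    d = len(delta)
    sig = {}
    for k in range(depth + 1):
        for word in itertools.product(range(d), repeat=k):
            sig[word] = math.prod(delta[i] for i in word) / math.factorial(k)
    return sig


def chen_product(sig_a, sig_b, d, depth):
    """Chen's relation: S(a * b)^w = sum over splits w = uv of S(a)^u S(b)^v."""
    out = {}
    for k in range(depth + 1):
        for word in itertools.product(range(d), repeat=k):
            out[word] = sum(sig_a[word[:j]] * sig_b[word[j:]] for j in range(k + 1))
    return out


def signature(points, depth):
    """Truncated signature of the piecewise-linear path through `points`."""
    d = len(points[0])
    increments = [[q - p for p, q in zip(a, b)] for a, b in zip(points, points[1:])]
    sig = segment_signature(increments[0], depth)
    for delta in increments[1:]:
        sig = chen_product(sig, segment_signature(delta, depth), d, depth)
    return sig


def level_norm(sig, k):
    """Hilbert-Schmidt norm of level k (square root of the sum of squared coefficients)."""
    return math.sqrt(sum(v * v for w, v in sig.items() if len(w) == k))


def one_variation(points):
    """1-variation of a piecewise-linear path: the sum of its segment lengths."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.5, 2.0)]
depth = 5
sig = signature(points, depth)
length = one_variation(points)
for k in range(depth + 1):
    bound = length**k / math.factorial(k)
    print(k, level_norm(sig, k), bound)
```

Storing every word up to the truncation depth costs on the order of $d^{\text{depth}}$ entries, which is the cost the sparse, kernel-based recovery described in the paper aims to avoid when only a few coefficients are needed.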

Figures (4)

  • Figure 1: Average errors when computing $S(x)^{(1,\ldots,n)}$ over 1,000 random paths constrained to $[0,1]^d$, with path length $L = 150$ and scaling depth $M = 3$. We compute average absolute errors (left) and average absolute errors scaled by average coefficient magnitude (right).
  • Figure 1: Dependence of runtime on coefficient depth, using serial versus parallel kernel computations (left, log scale), and parallel kernel computations versus a serial computation using Chen's relation (right, linear scale). Runtime is averaged over 10 random paths constrained to $[0,1]^d$, with path length $L = 10,000$, scaling depth $M = 1$ and PDE dyadic order set to $1$. For comparability, all computations are performed on an NVIDIA Tesla P100 GPU.
  • Figure 1: Generational model with $n = 13$ approximated using a $5$-step Euler scheme, leading to a sparsity of $s(\mathcal{E}_{s,t}^5) = 5 \times 10^{-5}$. Model parameters are chosen as $a(k) = 1$, $b(k) = f(k) = 10$, $c(k) = k / 14$, $d(k) = (14-k)/14$. Epidemic is chosen to never occur (top left), start late (top right), start mid-year (bottom left) or start early (bottom right).
  • Figure 2: Average errors when computing $S(x)^{(1,2,\ldots, n)}$ for random paths in $[0,1]^d$, with $L = 100$ (top), $L = 500$ (middle) and $L = 1000$ (bottom). Dyadic order of the PDE solver is set to 3.

Theorems & Definitions (57)

  • Definition 1.1
  • Definition 1.2
  • Definition 1.3: Signature Transform
  • Definition 1.4: Signature Coefficient
  • Lemma 1.5: Factorial Decay [lyons2014rough]
  • Proposition 1.6 [cass2024lecture]
  • Proposition 1.7: Chen's relation [chen1954iterated]
  • Definition 1.8: Hilbert-Schmidt inner product
  • Definition 1.9: Signature Kernel
  • Corollary 1.10
  • ...and 47 more