Lecture notes on rough paths and applications to machine learning
Thomas Cass, Cristopher Salvi
TL;DR
The notes develop the signature transform as a robust representation of paths, built from iterated integrals and encoded via controlled differential equations. They establish the algebraic and analytic framework, including Chen's relation and the shuffle identity, and define spaces of unparameterised paths with a topology suitable for learning. Composing the signature with kernel methods yields universal and characteristic signature kernels and their associated RKHSs, enabling regression, distribution regression, and hypothesis testing on path-valued data, with practical computation via PDE-based schemes and truncations. Finally, the notes bridge to rough path theory and deep learning, outlining neural rough differential equations and their training, thereby linking classical stochastic analysis with contemporary data-driven sequence modeling. The material thus provides both theoretical foundations and practical tools for applying signatures in time-series analysis, kernel learning, and neural differential equations, supporting efficient, invariant representations of streamed data and scalable learning pipelines for sequential tasks.
Abstract
These notes expound the recent use of the signature transform and rough path theory in data science and machine learning. We develop the core theory of the signature from first principles and then survey some recent popular applications of this approach, including signature-based kernel methods and neural rough differential equations. The notes are based on a course given by the two authors at Imperial College London.
