SigDiffusions: Score-Based Diffusion Models for Time Series via Log-Signature Embeddings
Barbora Barancikova, Zhuoyue Huang, Cristopher Salvi
TL;DR
SigDiffusions is a score-based diffusion framework that generates long multivariate time series by operating on log-signature embeddings of the data, so that the forward and backward processes preserve the algebraic structure of signatures. Noise is added and removed in the Lie algebra $\mathcal{L}^n(\mathbb{R}^d)$, and newly derived closed-form inversion formulae then recover a path from its log-signature by expressing its Fourier or orthogonal-polynomial coefficients as explicit polynomial functions of the log-signature. On synthetic and real-world datasets the method is competitive with state-of-the-art diffusion models, and the results are accompanied by evaluations of the inversion formulae and analyses of model capacity. Suggested future directions include alternative path embeddings and discrete-time signature approaches.
Abstract
Score-based diffusion models have recently emerged as state-of-the-art generative models for a variety of data modalities. Nonetheless, it remains unclear how to adapt these models to generate long multivariate time series. Viewing a time series as the discretisation of an underlying continuous process, we introduce SigDiffusion, a novel diffusion model operating on log-signature embeddings of the data. The forward and backward processes gradually perturb and denoise log-signatures while preserving their algebraic structure. To recover a signal from its log-signature, we provide new closed-form inversion formulae expressing the coefficients obtained by expanding the signal in a given basis (e.g. Fourier or orthogonal polynomials) as explicit polynomial functions of the log-signature. Finally, we show that combining SigDiffusions with these inversion formulae results in high-quality long time series generation, competitive with the current state-of-the-art on various datasets of synthetic and real-world examples.
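To make the notion of a log-signature embedding concrete, the following is a minimal sketch, not the authors' implementation. It assumes a piecewise-linear path and a truncation at depth 2, where the log-signature reduces to the total increment of the path together with its antisymmetric Lévy-area matrix. The function names `signature_depth2` and `log_signature_depth2` are hypothetical and chosen only for illustration. The depth-2 signature is accumulated segment by segment with Chen's identity.

```python
def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path given as a list of points.

    Returns (s1, s2): the level-1 terms (total increment, length d) and
    the level-2 terms (d x d matrix of iterated integrals).
    """
    d = len(path[0])
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment:
        # new s2 = s2 + s1 (x) delta + (delta (x) delta) / 2
        for i in range(d):
            for j in range(d):
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        s1 = [a + b for a, b in zip(s1, delta)]
    return s1, s2


def log_signature_depth2(path):
    """Depth-2 log-signature: total increment and Levy-area matrix."""
    s1, s2 = signature_depth2(path)
    d = len(s1)
    # Level 2 of the tensor logarithm: s2 - (s1 (x) s1) / 2, which is antisymmetric.
    area = [[s2[i][j] - 0.5 * s1[i] * s1[j] for j in range(d)] for i in range(d)]
    return s1, area


# A path in the plane that moves one unit right, then one unit up.
increment, area = log_signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
print(increment)  # [1.0, 1.0]
print(area)       # [[0.0, 0.5], [-0.5, 0.0]]
```

The symmetric part of the level-2 signature is determined by the increments alone, so the logarithm keeps only the antisymmetric area terms. A log-signature is in this sense a more compact embedding than the full signature. Higher truncation depths follow the same pattern but need higher-order Lie brackets, which dedicated signature libraries handle.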
