Conditioning non-linear and infinite-dimensional diffusion processes

Elizabeth Louise Baker, Gefan Yang, Michael L. Severinsen, Christy Anna Hipsley, Stefan Sommer

TL;DR

This paper conditions function-valued stochastic processes without prior discretisation. It uses an infinite-dimensional version of Girsanov's theorem to derive a stochastic differential equation (SDE) for the conditioned process, in which the score appears.

Abstract

Generative diffusion models and many stochastic models in science and engineering naturally live in infinite dimensions before discretisation. To incorporate observed data for statistical and learning tasks, one needs to condition on observations. While recent work has treated conditioning linear processes in infinite dimensions, conditioning non-linear processes in infinite dimensions has not been explored. This paper conditions function-valued stochastic processes without prior discretisation. To do so, we use an infinite-dimensional version of Girsanov's theorem to condition a function-valued stochastic process, leading to a stochastic differential equation (SDE) for the conditioned process involving the score. We apply this technique to time series analysis of the shapes of organisms in evolutionary biology, where we discretise via the Fourier basis and then learn the coefficients of the score function with score matching methods.
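To make the idea of "an SDE for the conditioned process involving the score" concrete, here is a minimal finite-dimensional sketch. It is not the paper's method. It assumes the simplest possible unconditioned process, a scaled Brownian motion $dX = \sigma\,dW$ in $\mathbb{R}^d$, for which the score of the transition density towards a fixed target is known in closed form, so nothing has to be learned. The function names and parameters (`score_brownian`, `simulate_conditioned`, `sigma`, `n_steps`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def score_brownian(t, x, target, T, sigma):
    """Closed-form score grad_x log h(t, x) for scaled Brownian motion.

    Here h(t, x) is the Gaussian transition density of reaching `target`
    at time T when starting from x at time t (an assumption of this sketch).
    """
    return (target - x) / (sigma**2 * (T - t))


def simulate_conditioned(x0, target, T=1.0, sigma=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama simulation of dX = sigma^2 * score dt + sigma dW."""
    rng = np.random.default_rng(seed)
    target = np.asarray(target, dtype=float)
    x = np.array(x0, dtype=float)
    dt = T / n_steps
    path = [x.copy()]
    # Stop one step before T: the drift is singular at t = T.
    for k in range(n_steps - 1):
        t = k * dt
        drift = sigma**2 * score_brownian(t, x, target, T, sigma)
        noise = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + drift * dt + noise
        path.append(x.copy())
    return np.array(path)


if __name__ == "__main__":
    path = simulate_conditioned([0.0, 0.0], [1.0, -2.0])
    print("final state:", path[-1])
```

In this toy case the conditioned process is a Brownian bridge, so the simulated path should end close to the target. For non-linear processes the score is generally not available in closed form, which is where the score matching step described in the abstract comes in.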

Paper Structure (29 sections, 11 theorems, 46 equations, 8 figures, 2 tables)

Key Result

Theorem 5.1

Let $h:[0, T]\times H \to \mathbb{R}_{>0}$ be a continuous function that is twice Fréchet differentiable with respect to $\xi \in H$ and once differentiable with respect to $t$, with continuous derivatives. Suppose $X$ is the strong solution to the stochastic differential equation (eq. sde). Moreover, [...] where $\widehat{W}$ is the Wiener process with respect to the measure $\widehat{\mathbb{P}}$.
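
For orientation, the classical finite-dimensional analogue of this kind of result is the Doob $h$-transform. The sketch below is not the theorem's statement. It assumes an SDE on $\mathbb{R}^d$ with drift $b$ and diffusion coefficient $\sigma$ (both symbols are introduced here for illustration and are not taken from the source), a positive function $h$ satisfying $\partial_t h + \mathcal{L} h = 0$ with $\mathcal{L}$ the generator of $X$, and enough integrability for $h(t, X_t)$ to be a true martingale.

$$
dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t,
\qquad
\frac{d\widehat{\mathbb{P}}}{d\mathbb{P}} = \frac{h(T, X_T)}{h(0, X_0)},
$$

$$
dX_t = \left[ b(t, X_t) + \sigma\sigma^{\top}(t, X_t)\,\nabla_x \log h(t, X_t) \right] dt + \sigma(t, X_t)\,d\widehat{W}_t .
$$

Under these assumptions, Girsanov's theorem yields the second equation under $\widehat{\mathbb{P}}$, with $\widehat{W}$ a Wiener process with respect to $\widehat{\mathbb{P}}$. The additional drift term $\nabla_x \log h$ plays the role of the score.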

Figures (8)

  • Figure 1: We condition an SDE between two curves representing two butterfly species, starting from the red dashed curve and ending at the green dashed curve. Each time point of the trajectory represents a shape. Row 1: We take the mean over $20$ trajectories. Row 2: We plot the $20$ individual trajectories used to compute the mean.
  • Figure 2: A stochastic process between two butterfly outlines (Papilio polytes in red, Parnassius honrathi in blue).
  • Figure 3: Score fields evaluated on a selection of points at different time steps. In general, the score field is expected to "push" the shape towards the target. We show the learned score fields (black arrows) represented by varying numbers $N$ of base functions at different time steps, as well as the current shape (blue curves) and the target shape (red curves).
  • Figure 4: We use a dataset of 40 butterfly specimens from five closely related species. We find a mean across the dataset and plot single trajectories between the mean at time $t=0$ (in blue) and a specimen from each species at time $t=1$ (in red).
  • Figure 5: The five closely related species of Papilio, from left to right: Papilio ambrax, Papilio deiphobus, Papilio protenor, Papilio phestus and Papilio polytes. A subset of the landmarks for each specimen is shown underneath each corresponding image.
  • ...and 3 more figures

Theorems & Definitions (19)

  • Theorem 5.1
  • proof
  • Lemma 5.2
  • proof
  • Lemma 5.3
  • Lemma 5.4
  • proof
  • Theorem C.1
  • Lemma C.2
  • proof
  • ...and 9 more