Probabilistic Forecasting for Dynamical Systems with Missing or Imperfect Data

Siddharth Rout, Eldad Haber, Stéphane Gaudreault

TL;DR

This work tackles forecasting of dynamical systems under missing or noisy data by adopting probabilistic forecasts through stochastic interpolation (SI) and flow matching. It learns a velocity field $v_\theta(q_t,t)$ to transform samples from the initial distribution $\pi_0$ to the future distribution $\pi_T$ via the ODE $dq/dt = v_\theta(q,t)$, enabling generation of diverse future states. To ensure realistic perturbations of the initial state, the method employs a flow-based variational autoencoder to perturb $q_0$ and produce ensembles that are propagated forward. Experiments on Predator-Prey, MovingMNIST, WeatherBench, and additional datasets show that the resulting ensemble means and variances closely track true distributions, with favorable MSE, MAE, and SSIM metrics, demonstrating a scalable approach for uncertainty quantification in high-dimensional dynamical systems.
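
As a rough illustration of the sampling step described above, the sketch below pushes an ensemble of initial states through $dq/dt = v(q,t)$ with forward Euler and then summarizes the result by its mean and standard deviation. Everything specific here is an assumption made for the example and not taken from the paper: `velocity` is a hypothetical closed-form stand-in for a trained $v_\theta$, the state is a scalar, the horizon is normalized to $T=1$, and a plain Gaussian perturbation replaces the flow-based variational autoencoder.

```python
import random
import statistics


def velocity(q, t, a=0.5, b=2.0):
    """Hypothetical stand-in for a trained v_theta(q, t).

    It is the exact velocity of the linear interpolant
    q_t = (1 - t) * q_0 + t * (a * q_0 + b), written as a function of q_t.
    The constants a and b are illustrative only.
    """
    q0 = (q - t * b) / (1.0 + t * (a - 1.0))  # recover the path's starting point
    return (a - 1.0) * q0 + b                 # velocity is constant along that path


def integrate(q0, n_steps=100):
    """Forward-Euler integration of dq/dt = v(q, t) from t = 0 to t = 1."""
    q, dt = q0, 1.0 / n_steps
    for k in range(n_steps):
        q += dt * velocity(q, k * dt)
    return q


rng = random.Random(0)
# Ensemble of perturbed initial states (assumed Gaussian, not the paper's VAE).
ensemble_0 = [rng.gauss(0.0, 1.0) for _ in range(500)]
# Propagate every member to the final time.
ensemble_T = [integrate(q) for q in ensemble_0]

# Ensemble statistics used as the probabilistic forecast summary.
mean_T = statistics.fmean(ensemble_T)
std_T = statistics.stdev(ensemble_T)
```

With this particular stand-in field each member ends near `a * q_0 + b`, so the forecast spread is the initial spread scaled by `a`. A learned field would generally need a finer step size or a higher-order integrator.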

Abstract

The modeling of dynamical systems is essential in many fields, but applying machine learning techniques is often challenging due to incomplete or noisy data. This study introduces a variant of stochastic interpolation (SI) for probabilistic forecasting, estimating future states as distributions rather than single-point predictions. We explore its mathematical foundations and demonstrate its effectiveness on various dynamical systems, including the challenging WeatherBench dataset.

Paper Structure

This paper contains 36 sections, 25 equations, 22 figures, 12 tables.

Figures (22)

  • Figure 1: Trajectories for the Predator-Prey model. Note that the trajectories get very close but do not intersect.
  • Figure 2: Left: The solution for ${\bf y}_1(0)=1$ and ${\bf y}_2(0) \sim U(0,1)$ at $t=200$. Right: The solution for ${\bf y}(0)=[0.1, 0.3]^{\top} + \epsilon$, where $\epsilon \sim N(0, 0.05 {\bf I})$, at $t=200$.
  • Figure 3: Comparison of the actual final distribution with that obtained using SI on the Predator-Prey model. Trajectories for the transport learned by SI are in blue. Note that these trajectories are not physical.
  • Figure 4: Six of 50 Moving MNIST trajectory predictions obtained using SI and their ensemble mean and standard deviation.
  • Figure 5: Two sample stochastic forecasts of U10 and T850 after 2 days obtained using SI and the ensemble mean and standard deviation for 78 forecasts.
  • ...and 17 more figures

Theorems & Definitions (4)

  • Definition 1
  • Example 1
  • Example 2
  • Definition 2