Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time Series Forecasting Based on Biological ODEs

Christian Klötergens, Vijaya Krishna Yalavarthi, Randolf Scholz, Maximilian Stubbemann, Stefan Born, Lars Schmidt-Thieme

TL;DR

Physiome-ODE addresses the lack of robust IMTS benchmarks by constructing a large, biologically grounded semi-synthetic suite of 50 ODE-derived datasets with controlled irregular sampling and noise. It introduces Joint Gradient Deviation (JGD) to quantify dataset difficulty and uses it to curate challenging IMTS instances, enabling meaningful evaluation of ODE-based forecasting models. Across experiments, neural ODEs and graph-based methods show variable performance across datasets, with no single model dominating; importantly, some datasets reveal clear advantages for channel-dependent models, while others remain solvable by simple baselines. The benchmark advances IMTS research by providing standardized, scalable, and diverse data, along with reproducible protocols and code, to better assess model capabilities and guide future architectural developments in time-series forecasting.
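The idea of deriving an IMTS from an ODE with controlled irregular sampling and noise can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual pipeline: the Lotka-Volterra system stands in for a biological ODE, and the names `make_imts`, `keep_prob`, and `noise_std`, as well as the choice of a Bernoulli observation mask and Gaussian noise, are hypothetical.

```python
import random


def lotka_volterra(state, alpha=1.1, beta=0.4, delta=0.1, gamma=0.4):
    """Right-hand side of a two-channel predator-prey ODE (illustrative stand-in)."""
    prey, predator = state
    return (alpha * prey - beta * prey * predator,
            delta * prey * predator - gamma * predator)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def make_imts(x0, duration=10.0, n_grid=200, keep_prob=0.3, noise_std=0.05, seed=0):
    """Integrate the ODE on a fine grid, then keep each (time, channel) value
    independently with probability keep_prob and add Gaussian noise.
    Both the mask and the noise model are assumptions made for this sketch."""
    rng = random.Random(seed)
    h = duration / n_grid
    state, observations = tuple(x0), []
    for step in range(1, n_grid + 1):
        state = rk4_step(lotka_volterra, state, h)
        for channel, value in enumerate(state):
            if rng.random() < keep_prob:
                noisy = value + rng.gauss(0.0, noise_std)
                observations.append((step * h, channel, noisy))
    return observations


imts = make_imts(x0=(10.0, 5.0))
```

Each observation is a `(time, channel, value)` triple, so channels are observed at different, non-aligned time points, which is what makes the series irregularly sampled with missing values.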

Abstract

State-of-the-art methods for forecasting irregularly sampled time series with missing values predominantly rely on just four datasets and a few small toy examples for evaluation. While ordinary differential equations (ODEs) are the prevalent models in science and engineering, a baseline model that forecasts a constant value outperforms ODE-based models from the last five years on three of these existing datasets. This unintuitive finding hampers further research on ODE-based models, a more plausible model family. In this paper, we develop a methodology to generate irregularly sampled multivariate time series (IMTS) datasets from ordinary differential equations and to select challenging instances via rejection sampling. Using this methodology, we create Physiome-ODE, a large and sophisticated benchmark of IMTS datasets consisting of 50 individual datasets, derived from real-world ordinary differential equations from research in biology. Physiome-ODE is the first benchmark for IMTS forecasting that we are aware of and is an order of magnitude larger than the current evaluation setting of four datasets. Using Physiome-ODE, we obtain qualitatively different results from those derived from the current four datasets: on Physiome-ODE, ODE-based models can play to their strengths, and the benchmark can differentiate in a meaningful way between different IMTS forecasting models. In this way, we expect to give a new impulse to research on ODE-based time series modeling.
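To make the constant-value baseline mentioned above concrete, here is a minimal sketch. The abstract does not say which constant is forecast, so the per-channel mean of the observed context is an assumption made for illustration; `constant_forecast` and `mse` are hypothetical helper names, and observations are assumed to be `(time, channel, value)` triples.

```python
def constant_forecast(context, n_channels, fallback=0.0):
    """Return one constant per channel: the mean of that channel's observed
    context values (assumed choice), or `fallback` if the channel is unobserved."""
    sums = [0.0] * n_channels
    counts = [0] * n_channels
    for _, channel, value in context:
        sums[channel] += value
        counts[channel] += 1
    return [sums[c] / counts[c] if counts[c] else fallback
            for c in range(n_channels)]


def mse(constants, targets):
    """Mean squared error of the constant forecast over all queried targets."""
    errors = [(value - constants[channel]) ** 2 for _, channel, value in targets]
    return sum(errors) / len(errors)


# Toy example: two channels observed at non-aligned times.
context = [(0.1, 0, 1.0), (0.4, 0, 3.0), (0.5, 1, 2.0)]
targets = [(1.2, 0, 2.0), (1.5, 1, 4.0)]

constants = constant_forecast(context, n_channels=2)
baseline_error = mse(constants, targets)
```

A forecast of this kind ignores the query times entirely, which is why a dataset on which it is competitive says little about a model's ability to capture dynamics.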

Paper Structure

This paper contains 32 sections, 3 theorems, 39 equations, 2 figures, 5 tables.

Key Result

Lemma 1

For a function $x \in \mathcal{C}^{1}([0,T])$ and $\epsilon > 0$ such that $\frac{T}{\epsilon} \in \mathbb{N}$, we consider the divided differences […]. It then holds that the numerical estimator $\widehat{\mathrm{std}}[(x_k)_{k=1:K}] \coloneqq \sqrt{\frac{1}{K}\sum_{k=1}^{K} (x_k - \bar{x})^2}$ of the standard deviation of the divided differences converges to the […]
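The estimator in the lemma can be tried numerically. The sketch below rests on two assumptions, because the statement is cut off in this excerpt: the divided differences are taken as forward differences on the uniform grid with step $\epsilon$, and the comparison target is the standard deviation of the derivative $x'$ over $[0,T]$. For $x = \sin$ on $[0, 2\pi]$ that reference value is $\sqrt{1/2}$. The function names are hypothetical.

```python
import math


def divided_differences(x, T, eps):
    """Forward divided differences (x(k*eps) - x((k-1)*eps)) / eps for k = 1..T/eps.
    This grid definition is an assumption; the excerpt omits the formula."""
    K = round(T / eps)
    return [(x(k * eps) - x((k - 1) * eps)) / eps for k in range(1, K + 1)]


def std_hat(values):
    """Estimator from the lemma: sqrt of the mean squared deviation (1/K normalization)."""
    K = len(values)
    mean = sum(values) / K
    return math.sqrt(sum((v - mean) ** 2 for v in values) / K)


T = 2 * math.pi
# Estimates for increasingly fine grids, with eps = T / K so that T / eps is an integer.
estimates = {K: std_hat(divided_differences(math.sin, T, T / K))
             for K in (10, 100, 1000)}

# Assumed limit: standard deviation of cos over one full period.
reference = math.sqrt(0.5)
errors = {K: abs(value - reference) for K, value in estimates.items()}
```

In this example the gap to the reference value shrinks as the grid is refined, which is the behavior a convergence statement of this kind describes.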

Figures (2)

  • Figure 1: Demonstration of time series realized by 4 ODEs of different prediction difficulties without adding any noise. Each line (color) represents a channel. Trajectories are shown for a duration of the respective $\sigma_\text{dur}$ as shown in \ref{tab:ds-info}.
  • Figure 2: Test MSE of the best performing model vs. $\mathrm{JGD}$ score across 50 datasets.

Theorems & Definitions (6)

  • Lemma 1
  • Lemma 2
  • proof
  • Lemma 3: Uniform Law of Large Numbers
  • proof
  • proof