Generative Modeling of Clinical Time Series via Latent Stochastic Differential Equations

Muhammad Aslanimoghanloo, Ahmed ElGazzar, Marcel van Gerven

TL;DR

Clinical time series are irregularly sampled and reflect uncertain disease progression, which complicates accurate forecasting and decision-making. The authors introduce a latent neural stochastic differential equation (SDE) framework with a variational encoder–decoder that models continuous-time patient trajectories and supports uncertainty-aware forecasting under arbitrary treatment plans. Across synthetic PKPD and real ICU data, the latent SDE consistently outperforms latent ODE and latent LSTM baselines, with particularly strong improvements in uncertainty calibration and in robustness to noise and missing data. The approach holds promise for precision medicine because it delivers accurate, probabilistic predictions that explicitly quantify uncertainty to guide clinical decisions.

Abstract

Clinical time series data from electronic health records and medical registries offer unprecedented opportunities to understand patient trajectories and inform medical decision-making. However, leveraging such data presents significant challenges due to irregular sampling, complex latent physiology, and inherent uncertainties in both measurements and disease progression. To address these challenges, we propose a generative modeling framework based on latent neural stochastic differential equations (SDEs) that views clinical time series as discrete-time partial observations of an underlying controlled stochastic dynamical system. Our approach models latent dynamics via neural SDEs with modality-dependent emission models, while performing state estimation and parameter learning through variational inference. This formulation naturally handles irregularly sampled observations, learns complex non-linear interactions, and captures the stochasticity of disease progression and measurement noise within a unified scalable probabilistic framework. We validate the framework on two complementary tasks: (i) individual treatment effect estimation using a simulated pharmacokinetic-pharmacodynamic (PKPD) model of lung cancer, and (ii) probabilistic forecasting of physiological signals using real-world intensive care unit (ICU) data from 12,000 patients. Results show that our framework outperforms ordinary differential equation and long short-term memory baseline models in accuracy and uncertainty estimation. These results highlight its potential for enabling precise, uncertainty-aware predictions to support clinical decision-making.
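
To make the abstract's viewpoint concrete (a controlled stochastic dynamical system observed partially, at discrete and irregular times), here is a minimal NumPy sketch using Euler-Maruyama integration. It is an illustration under stated assumptions, not the authors' implementation: `drift`, `diffusion`, and `control` are hypothetical hand-written toy functions standing in for the learned neural networks and the treatment plan, and the Gaussian emission on the first latent coordinate is likewise an assumption.

```python
import numpy as np

def drift(z, u):
    # Toy stand-in for a learned drift network (assumption):
    # mean reversion plus an additive treatment effect u.
    return -0.5 * z + u

def diffusion(z):
    # Toy stand-in for a learned diffusion network (assumption):
    # a constant noise scale for every latent coordinate.
    return 0.2 * np.ones_like(z)

def control(t):
    # Hypothetical treatment plan: a constant dose for 2 <= t < 4.
    return 1.0 if 2.0 <= t < 4.0 else 0.0

def simulate_observations(z0, obs_times, rng, dt=0.01, obs_noise=0.05):
    """Euler-Maruyama integration of dz = f(z, u) dt + g(z) dW,
    emitting a noisy observation at each (irregular) time in obs_times."""
    z = np.array(z0, dtype=float)
    t = 0.0
    observations = []
    for t_obs in obs_times:
        # Integrate from the current time up to the next observation time.
        while t < t_obs - 1e-12:
            h = min(dt, t_obs - t)  # shorten the last step to land on t_obs
            dw = rng.normal(0.0, np.sqrt(h), size=z.shape)  # Brownian increment
            z = z + drift(z, control(t)) * h + diffusion(z) * dw
            t += h
        # Partial observation: only the first latent coordinate is measured,
        # with additive Gaussian measurement noise.
        observations.append(z[0] + rng.normal(0.0, obs_noise))
    return np.array(observations)

rng = np.random.default_rng(0)
obs_times = np.array([0.3, 0.9, 2.5, 2.6, 4.7, 7.2])  # irregular sampling grid
y = simulate_observations(z0=[1.0, 0.0], obs_times=obs_times, rng=rng)
```

Because the solver steps to whatever time the next measurement occurs, unevenly spaced observations need no imputation or resampling onto a regular grid, which is the property the abstract attributes to the continuous-time formulation.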

Paper Structure

This paper contains 28 sections, 16 equations, 3 figures, and 9 tables.

Figures (3)

  • Figure 1: Architecture of the latent SDE framework for time series modeling with external inputs. The model consists of three main components: (1) Data encoders that transform external inputs and observations into latent representations; (2) A latent space containing both a generative SDE and an augmented SDE that evolve the hidden state conditioned on the encoded inputs; and (3) A decoder that reconstructs the output in the data space. The framework enables both reconstruction of observed data and forecasting of future values by leveraging the continuous-time dynamics learned through the coupled SDE system.
  • Figure 2: Illustration of the synthetic PKPD dataset with latent SDE framework predictions. The figure shows three variables over a one-year simulation period: performance score (top), tumor size in the unobserved latent space (middle), and cancer cell count (bottom). Blue circles indicate historical observations during the observation period, while orange circles with error bars show model predictions during the forecasting period. Vertical red lines indicate chemotherapy sessions, and vertical brown lines indicate radiotherapy sessions. The dashed black line separates the observation period (blue shaded) from the forecasting period (orange shaded). Shaded regions around predictions represent 95% confidence intervals, demonstrating the model's uncertainty quantification capability.
  • Figure 3: Sample ICU patient trajectories from the PhysioNet 2012 dataset. The figure shows three vital signs over 48 hours: heart rate (top), mean arterial pressure (middle), and body temperature (bottom). Blue circles represent observed measurements during the first 24 hours (observation period, blue shaded region), while orange circles show model predictions for the subsequent 24 hours (forecasting period, orange shaded region). Orange dashed lines indicate predicted mean values, and shaded regions represent 95% confidence intervals. The vertical dashed line separates the observation and forecasting periods, demonstrating the model's ability to forecast vital signs with calibrated uncertainty estimates.
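
The predicted means and 95% bands described in the captions for Figures 2 and 3 could be obtained from sampled forecast trajectories; the sketch below shows one common way to do that. It assumes a sample-based (Monte Carlo) interval, which the captions do not confirm, and it uses a Gaussian random walk as a stand-in for samples drawn from a trained model. The function names `predictive_summary` and `empirical_coverage` are hypothetical.

```python
import numpy as np

def predictive_summary(samples, level=0.95):
    """Summarize forecast samples of shape (n_samples, n_times)
    by their pointwise mean and a central interval at the given level."""
    alpha = 1.0 - level
    mean = samples.mean(axis=0)
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1.0 - alpha / 2, axis=0)
    return mean, lower, upper

def empirical_coverage(y_true, lower, upper):
    # Fraction of true values that fall inside the interval; for a
    # well-calibrated 95% band this should be close to 0.95 on average.
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

rng = np.random.default_rng(0)
# Stand-in for model forecasts (assumption): 2000 random-walk trajectories
# over 24 hourly steps, plus one held-out trajectory from the same process.
samples = np.cumsum(rng.normal(0.0, 0.1, size=(2000, 24)), axis=1)
y_true = np.cumsum(rng.normal(0.0, 0.1, size=24))

mean, lower, upper = predictive_summary(samples)
coverage = empirical_coverage(y_true, lower, upper)
```

Comparing `coverage` against the nominal level over many held-out patients is one simple way to check whether uncertainty estimates are calibrated rather than merely wide.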