Longitudinal Flow Matching for Trajectory Modeling

Mohammad Mohaiminul Islam, Thijs P. Kuipers, Sharvaree Vadgama, Coen de Vente, Afsana Khan, Clara I. Sánchez, Erik J. Bekkers

TL;DR

IMMFM models high-dimensional, sparsely observed longitudinal trajectories by learning a continuous stochastic flow that is jointly consistent with multiple time points. It introduces a piecewise-quadratic conditional path and jointly learns a drift $v_\theta$ and a data-driven diffusion $g_\theta$ that captures uncertainty, and it handles incomplete trajectories with multi-marginal optimal transport (MMOT). The method outperforms existing methods in forecasting and downstream tasks on synthetic benchmarks and real longitudinal neuroimaging data (e.g., ADNI, GBM, MS), with notable gains from the uncertainty-aware SU-IMMFM variant and 3D extensions. The work advances subject-specific trajectory modeling under irregular sampling, with practical relevance for prognosis and treatment planning in clinical settings, and opens directions toward temporally aware latent spaces and physics-informed constraints.

Abstract

Generative models for sequential data often struggle with sparsely sampled and high-dimensional trajectories, typically reducing the learning of dynamics to pairwise transitions. We propose Interpolative Multi-Marginal Flow Matching (IMMFM), a framework that learns continuous stochastic dynamics jointly consistent with multiple observed time points. IMMFM employs a piecewise-quadratic interpolation path as a smooth target for flow matching and jointly optimizes drift and a data-driven diffusion coefficient, supported by a theoretical condition for stable learning. This design captures intrinsic stochasticity, handles irregular sparse sampling, and yields subject-specific trajectories. Experiments on synthetic benchmarks and real-world longitudinal neuroimaging datasets show that IMMFM outperforms existing methods in both forecasting accuracy and further downstream tasks.
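
To make the idea of a piecewise-quadratic interpolation path concrete, the following is a minimal sketch and not the paper's construction. It assumes a plain Lagrange quadratic fitted through three consecutive observations, evaluated together with its time derivative, which could serve as a target velocity for flow matching. The function names `quadratic_segment` and `piecewise_quadratic` and the window-selection rule are illustrative assumptions.

```python
import numpy as np

def quadratic_segment(t, ts, xs):
    """Lagrange quadratic through three observations (ts[i], xs[i]).

    ts: three distinct times; xs: array of shape (3, d).
    Returns the interpolated position and its time derivative at time t.
    """
    t0, t1, t2 = ts
    x0, x1, x2 = np.asarray(xs, dtype=float)
    # Lagrange basis polynomials.
    l0 = (t - t1) * (t - t2) / ((t0 - t1) * (t0 - t2))
    l1 = (t - t0) * (t - t2) / ((t1 - t0) * (t1 - t2))
    l2 = (t - t0) * (t - t1) / ((t2 - t0) * (t2 - t1))
    # Derivatives of the basis polynomials with respect to t.
    d0 = (2 * t - t1 - t2) / ((t0 - t1) * (t0 - t2))
    d1 = (2 * t - t0 - t2) / ((t1 - t0) * (t1 - t2))
    d2 = (2 * t - t0 - t1) / ((t2 - t0) * (t2 - t1))
    pos = l0 * x0 + l1 * x1 + l2 * x2
    vel = d0 * x0 + d1 * x1 + d2 * x2
    return pos, vel

def piecewise_quadratic(t, times, points):
    """Evaluate a piecewise-quadratic path through irregular observations.

    Assumed rule (illustrative): for t in [times[k], times[k+1]] use the
    quadratic through observations k, k+1, k+2; the last window is reused
    for the final interval.
    """
    times = np.asarray(times, dtype=float)
    points = np.asarray(points, dtype=float)
    if len(times) < 3:
        raise ValueError("need at least three observations")
    k = int(np.searchsorted(times, t, side="right")) - 1
    k = min(max(k, 0), len(times) - 3)
    return quadratic_segment(t, times[k:k + 3], points[k:k + 3])

# Irregularly spaced toy observations of a 2-D trajectory (made-up values).
times = [0.0, 0.2, 0.7, 1.0]
points = [[0.0, 0.0], [1.0, 0.5], [2.0, 2.0], [3.0, 2.5]]
pos, vel = piecewise_quadratic(0.45, times, points)
```

The path passes through every observation exactly, so irregular spacing needs no resampling. Under this simple window rule the velocity is not guaranteed to be continuous at the observation times.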

Paper Structure

This paper contains 43 sections, 4 theorems, 47 equations, 9 figures, 6 tables, 2 algorithms.

Key Result

Proposition 3.1

Under mild regularity conditions for all $x\in\mathbb R^{d}$, $t\in[0,1]$, every stationary point of $\mathcal{L}_{\mathrm{SDE}}$ (the flow and score matching loss) is a stationary point of $\mathcal{L}_{\mathrm{IMMFM}}$ (the complete IMMFM objective).

Figures (9)

  • Figure 1: Overview of modeling problem and data. (a) Example of a sparsely and irregularly observed trajectory of disease progression over time. (b) IMMFM estimates positional uncertainty that informs the SDE's data-driven diffusion term. (c) IMMFM takes as input the position $x_t$, time $t$, and conditional variables $C$ and predicts the velocity $v_\theta$, diffusion term $g_\theta$, and uncertainty $S_\theta$.
  • Figure 2: Trajectories on synthetic S-shaped (top row) and $\sigma$-shaped (bottom row) Gaussian datasets. Colored dots show subsets of training samples, and grey lines show the predicted trajectories. From left to right: TFM [zhang2024trajectory], L-MMFM [rohbeckmodeling], MMFM [rohbeckmodeling], and IMMFM (ours).
  • Figure 3: Trajectory on the Starmen dataset. The conditioning frame is marked in green, and the reference starting frame is marked in blue. Left: hand-downward motion. Right: hand-upward motion.
  • Figure 4: Visual comparison of forecasting results on the Alzheimer's (ADNI) dataset. The first row displays the forecasted images, and the second row shows the corresponding pixel-wise difference maps between each forecast and the ground truth. The evaluation metrics (DSC, HD, and PSNR) are listed at the top.
  • Figure 5: (a) Ground truth and predicted mean ventricle growth over time. (b) Ventricular areas at the second visit ($\sim$18 months) versus model-predicted areas at the last visit ($\sim$36 months) for Alzheimer’s (AD) and cognitively normal (CN) subjects.
  • ...and 4 more figures
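
The Figure 1 caption describes a model that predicts a velocity $v_\theta$ and a diffusion term $g_\theta$ for an SDE. As a hedged illustration of how a trajectory could be sampled from such a pair, the sketch below uses Euler-Maruyama integration. It assumes the SDE has the form $dx = v\,dt + g\,dW$ with elementwise diffusion and omits the conditional variables $C$. The integrator choice and the functions `toy_drift` and `toy_diffusion` are hypothetical stand-ins, not the paper's trained networks or its sampler.

```python
import numpy as np

def euler_maruyama(x0, drift, diffusion, t0=0.0, t1=1.0, n_steps=100, rng=None):
    """Simulate dx = drift(x, t) dt + diffusion(x, t) dW with Euler-Maruyama.

    Returns an array of shape (n_steps + 1, *x0.shape) holding the path.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    dt = (t1 - t0) / n_steps
    path = [x.copy()]
    for i in range(n_steps):
        t = t0 + i * dt
        # Brownian increment over one step has standard deviation sqrt(dt).
        noise = rng.standard_normal(x.shape)
        x = x + drift(x, t) * dt + diffusion(x, t) * np.sqrt(dt) * noise
        path.append(x.copy())
    return np.stack(path)

# Hypothetical stand-ins for trained networks v_theta and g_theta.
def toy_drift(x, t):
    return -x  # pulls the state toward the origin

def toy_diffusion(x, t):
    return 0.1 * np.ones_like(x)  # constant, state-independent noise scale

path = euler_maruyama(np.array([1.0, -1.0]), toy_drift, toy_diffusion,
                      n_steps=50, rng=np.random.default_rng(0))
```

Repeating the call with different random generators yields an ensemble of trajectories from the same starting point, whose spread reflects the diffusion term.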

Theorems & Definitions (7)

  • Proposition 3.1: Gradient equivalence at stationary points
  • Lemma 3.1: Zero-mean residual
  • proof
  • proof
  • Proposition 3.2: Diffeomorphic MMOT Decomposition
  • Proposition 3.2
  • proof