CRONOS: Continuous Time Reconstruction for 4D Medical Longitudinal Series

Nico Albert Disch, Saikat Roy, Constantin Ulrich, Yannick Kirchhoff, Maximilian Rokuss, Robin Peretzke, David Zimmerer, Klaus Maier-Hein

TL;DR

CRONOS introduces a unified flow-based framework for continuous spatio-temporal forecasting in 4D medical imaging, capable of predicting a target 3D volume from multiple past scans under both discrete and continuous timestamps. By treating context frames as a stack $X_0$ and broadcasting the target as $X_1$, CRONOS learns a shared velocity field to transport context volumes toward the target in voxel space, with continuous-time conditioning via Fourier-encoded timestamps. The method demonstrates state-of-the-art performance across Cine-MRI, perfusion CT, and longitudinal MRI datasets, while offering memory-efficient training and inference compared to diffusion-based approaches. A continuous variant with explicit time conditioning generally outperforms discrete counterparts, underscoring the value of real-valued temporal information in irregularly sampled longitudinal data. The work provides comprehensive benchmarks and reusable code, offering a robust foundation for future spatio-temporal modeling in clinical imaging and precision medicine.
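To make the phrase "Fourier-encoded timestamps" concrete, here is a minimal sketch of how a real-valued timestamp could be turned into a sinusoidal feature vector for conditioning. The summary does not specify the encoding, so the helper name `fourier_encode`, the number of bands, and the geometric frequency spacing are all illustrative assumptions rather than the authors' implementation.

```python
import math


def fourier_encode(t, num_bands=4, max_freq=8.0):
    """Encode a real-valued timestamp t as interleaved [sin, cos] features.

    Hypothetical helper: num_bands and max_freq are illustrative choices,
    not values taken from the paper.
    """
    # Geometrically spaced frequencies from 1 up to max_freq (assumption).
    if num_bands == 1:
        freqs = [1.0]
    else:
        freqs = [max_freq ** (k / (num_bands - 1)) for k in range(num_bands)]

    features = []
    for f in freqs:
        angle = 2.0 * math.pi * f * t
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features
```

Calling `fourier_encode(0.37)` yields a vector of length `2 * num_bands` with entries in $[-1, 1]$. Nearby timestamps map to nearby vectors, which is what lets a network condition smoothly on irregular acquisition times instead of on a grid index.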

Abstract

Forecasting how 3D medical scans evolve over time is important for disease progression, treatment planning, and developmental assessment. Yet existing models rely on a single prior scan, assume fixed grid times, or target global labels, which limits voxel-level forecasting under irregular sampling. We present CRONOS, a unified framework for many-to-one prediction from multiple past scans that supports both discrete (grid-based) and continuous (real-valued) timestamps in one model. To the best of our knowledge, it is the first to achieve continuous sequence-to-image forecasting for 3D medical data. CRONOS learns a spatio-temporal velocity field that transports context volumes toward a target volume at an arbitrary time, while operating directly in 3D voxel space. Across three public datasets spanning Cine-MRI, perfusion CT, and longitudinal MRI, CRONOS outperforms the baselines while remaining computationally competitive. We will release code and evaluation protocols to enable reproducible, multi-dataset benchmarking of multi-context, continuous-time forecasting.

Paper Structure

This paper contains 43 sections, 21 equations, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: Task and benchmark comparison. (a) Task setup: Forecasting a target 3D scan from multiple past volumes in two regimes. Discrete: acquisitions lie approximately on a regular grid, but may contain missing frames (dotted boxes). Continuous: acquisitions occur at irregular, real-valued timestamps and are used directly without grid alignment. Many-to-one task $(\{I_i\}_{i=1}^T, t_\text{target}) \to I_\text{target}$. (b) Efficiency and performance: Left: GPU memory scaling of a single forward pass with sequence length $T$ shows CRONOS to be substantially more memory-efficient than alternatives. Right: Average SSIM across two datasets, where CRONOS outperforms baselines and LCI.
  • Figure 2: CRONOS method overview. Left: Discrete CRONOS treats time implicitly, interpolating between context frames and a fixed target along a normalized flow step $t \in [0,1]$. Right: Continuous CRONOS explicitly conditions on real-valued timestamps $t_i$, allowing each context $I_i$ to transport toward the target via its own interpolation $t_i$. This enables predictions at arbitrary target times while preserving the true temporal geometry.
  • Figure 3: Qualitative comparison on the ACDC dataset. Ground truth (GT), Last Context Image (LCI), our method (CRONOS), and SimVP. Upper row: prediction, lower row: residuals.
  • Figure 4: Qualitative comparison on the LUMIERE dataset. Ground truth (GT), Last Context Image (LCI), our method (CRONOS), and SimVP baseline. LUMIERE is particularly challenging due to the very small dataset size, highlighting the benefit of explicit continuous-time conditioning under extreme data scarcity.
  • Figure 5: Network flows. Top: input images at the first five timestamps. Middle: ground-truth voxel-wise differences ($|I_i - I_\text{target}|$). Bottom: predicted velocity fields $v_\theta(X_0,0)$, overlaid on the corresponding inputs. The highlighted regions coincide with the areas of the largest temporal changes (primarily the ventricular cavities and myocardial boundaries).
  • ...and 3 more figures
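
The discrete setup described in the Figure 2 caption (context stack $X_0$, broadcast target $X_1$, normalized flow step $t \in [0,1]$) can be sketched on toy data as follows. This is a sketch under stated assumptions, not the paper's implementation: it assumes a linear interpolation path with constant velocity $X_1 - X_0$ and fixed-step Euler integration, represents each volume as a flat list of voxel values, and uses hypothetical function names throughout.

```python
def broadcast_target(target, num_context):
    """Repeat the target volume once per context frame to form X1."""
    return [list(target) for _ in range(num_context)]


def interpolate(x0, x1, t):
    """Assumed linear path x_t = (1 - t) * x0 + t * x1, per frame and voxel."""
    return [[(1.0 - t) * a + t * b for a, b in zip(f0, f1)]
            for f0, f1 in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the linear path, x1 - x0 (the regression target)."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(x0, x1)]


def euler_transport(x0, velocity_fn, num_steps=4):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed Euler steps.

    velocity_fn stands in for a trained network v_theta(x, t).
    """
    x = [list(frame) for frame in x0]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        v = velocity_fn(x, k * dt)
        x = [[xi + dt * vi for xi, vi in zip(fx, fv)]
             for fx, fv in zip(x, v)]
    return x


# Toy example: two context "volumes" with two voxels each.
context = [[0.0, 2.0], [1.0, 1.0]]            # X0: stacked context frames
x1 = broadcast_target([4.0, 0.0], len(context))  # X1: target, one copy per frame
oracle = target_velocity(context, x1)

# With the exact velocity plugged in, every context frame reaches the target.
transported = euler_transport(context, lambda x, t: oracle)
```

In training, a network would be regressed onto `target_velocity(...)` at points sampled from `interpolate(...)`. The continuous variant in the caption would additionally condition on the real-valued timestamps $t_i$, which this toy sketch omits.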