MIRA: Medical Time Series Foundation Model for Real-World Health Data

Hao Li, Bowen Deng, Chang Xu, Zhiyuan Feng, Viktor Schlegel, Yu-Hao Huang, Yizheng Sun, Jingyuan Sun, Kailai Yang, Yiyao Yu, Jiang Bian

TL;DR

MIRA is a medical time series foundation model built for irregularly sampled data. It combines a Continuous-Time Rotary Positional Encoding, a frequency-aware sparse mixture-of-experts (MoE) layer, and a Neural ODE-based extrapolation module. Pretrained on over 454 billion time points from publicly available datasets, it reduces forecasting error by an average of 10% in out-of-distribution and 7% in in-distribution scenarios relative to other zero-shot and fine-tuned baselines, which points to the value of medical-domain pretraining and continuous-time modeling. The work also provides a benchmark spanning multiple downstream clinical tasks to support cross-institution medical time series forecasting.

Abstract

A unified foundation model for medical time series -- pretrained on open access and ethics board-approved medical corpora -- offers the potential to reduce annotation burdens, minimize model customization, and enable robust transfer across clinical institutions, modalities, and tasks, particularly in data-scarce or privacy-constrained environments. However, existing generalist time series foundation models struggle with medical time series because of the data's inherent challenges: irregular intervals, heterogeneous sampling rates, and frequent missing values. To address these challenges, we introduce MIRA, a unified foundation model specifically designed for medical time series forecasting. MIRA incorporates a Continuous-Time Rotary Positional Encoding that enables fine-grained modeling of variable time intervals, a frequency-specific mixture-of-experts layer that routes computation across latent frequency regimes to further promote temporal specialization, and a Continuous Dynamics Extrapolation Block based on a Neural ODE that models the continuous trajectory of latent states, enabling accurate forecasting at arbitrary target timestamps. Pretrained on a large-scale and diverse medical corpus comprising over 454 billion time points collected from publicly available datasets, MIRA reduces forecasting errors by an average of 10% and 7% in out-of-distribution and in-distribution scenarios, respectively, compared to other zero-shot and fine-tuned baselines. We also introduce a comprehensive benchmark spanning multiple downstream clinical tasks, establishing a foundation for future research in medical time series modeling.
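
To make the idea of a continuous-time rotary encoding concrete, the following is a minimal sketch, not the paper's implementation. It assumes the standard RoPE frequency schedule and simply replaces the integer token index with a real-valued timestamp; the function name `ct_rope`, the `base` parameter, and the example vectors are hypothetical. It illustrates that the attention score between a rotated query and key then depends only on the time gap between them, even when that gap is irregular.

```python
import math

def ct_rope(vec, t, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to a
    real-valued timestamp `t` (assumed schedule: standard RoPE frequencies)."""
    d = len(vec)
    assert d % 2 == 0, "embedding dimension must be even"
    out = []
    for i in range(d // 2):
        theta = t * base ** (-2.0 * i / d)  # angle for the i-th pair
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query/key vectors observed at irregular timestamps.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Two pairs of timestamps with the same gap (1.75) but different offsets.
score_a = dot(ct_rope(q, 2.5), ct_rope(k, 0.75))
score_b = dot(ct_rope(q, 12.25), ct_rope(k, 10.5))
print(abs(score_a - score_b) < 1e-9)
```

Because each pair is rotated by an angle linear in `t`, shifting both timestamps by the same amount leaves the score unchanged, which is the property that lets such an encoding handle variable sampling intervals.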

Paper Structure

This paper contains 36 sections, 1 theorem, 26 equations, 3 figures, 7 tables, 1 algorithm.

Key Result

Theorem 1

Let $f_{\text{ODE}}$ be spectrally normalized with maximum singular value $\sigma_{\max}$, and Lipschitz continuous with constant $L=\sigma_{\max}$. Then for $\Delta t = t_{N+1}-t_N>0$, the state evolution admits a unique solution satisfying

Figures (3)

  • Figure 1: Medical time series exhibit ① irregular intervals, ② heterogeneous sampling rates, and ③ frequent missingness driven by clinical workflows.
  • Figure 2: Architecture of MIRA. ① Takes irregular medical time series and timestamps as input, applying CT-RoPE for continuous temporal encoding. ② A Sparse Temporal Mixture-of-Experts layer routes tokens to specialized experts based on frequency. ③ A Continuous Dynamics Extrapolation Block evolves latent states toward arbitrary target timestamps for flexible time-aware forecasting.
  • Figure 3: Gating scores for experts across different layers in the three different frequency datasets.
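
The gating scores mentioned in Figures 2 and 3 can be illustrated with a generic sparse mixture-of-experts router. This is a sketch under assumptions: it uses plain top-k softmax gating, which may differ from MIRA's frequency-aware router, and the names `top_k_gate`, `moe_forward`, `router_weights`, and the toy experts are hypothetical.

```python
import math

def top_k_gate(logits, k=2):
    """Softmax over router logits, keep the k largest, renormalize to sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_forward(x, experts, router_weights, k=2):
    """Route token `x` to its top-k experts and mix their outputs by gate score."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    gates = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for idx, g in gates.items():
        y = experts[idx](x)
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, gates

# Toy experts: each one just scales the token by a different factor.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (0.5, 1.0, 2.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]

token = [0.2, 0.9]
output, gates = moe_forward(token, experts, router_weights, k=2)
print(sorted(gates))  # indices of the experts that received this token
```

Only the selected experts are evaluated for each token, so compute stays roughly constant as the number of experts grows; per-expert gate scores of this kind are what a plot like Figure 3 would summarize across layers.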

Theorems & Definitions (1)

  • Theorem 1: Existence and Boundedness
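
The role of spectral normalization in Theorem 1 can be checked numerically. The sketch below does not reproduce the paper's exact inequality; it verifies the standard Grönwall-type bound $\|h(t_N+\Delta t)\| \le \|h(t_N)\|\, e^{\sigma_{\max}\Delta t}$, which holds for any dynamics that are Lipschitz with constant $\sigma_{\max}$ and vanish at the origin. The toy dynamics `f_ode` (a single tanh layer), the 2x2 weight matrix, and all function names are assumptions for illustration.

```python
import math

def spectral_norm_2x2(W):
    """Largest singular value of a 2x2 matrix via the eigenvalues of W^T W."""
    (w11, w12), (w21, w22) = W
    a = w11 * w11 + w21 * w21
    d = w12 * w12 + w22 * w22
    b = w11 * w12 + w21 * w22
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
    return math.sqrt(lam)

def spectrally_normalize(W, sigma_max=1.0):
    """Rescale W so that its largest singular value equals sigma_max."""
    s = spectral_norm_2x2(W)
    return [[sigma_max * w / s for w in row] for row in W]

def f_ode(h, W):
    """Toy dynamics tanh(W h); Lipschitz with constant sigma_max, and f(0) = 0."""
    return [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W]

def rk4(h, W, delta_t, steps=100):
    """Integrate dh/dt = f_ode(h) over delta_t with classical Runge-Kutta."""
    dt = delta_t / steps
    for _ in range(steps):
        k1 = f_ode(h, W)
        k2 = f_ode([x + 0.5 * dt * k for x, k in zip(h, k1)], W)
        k3 = f_ode([x + 0.5 * dt * k for x, k in zip(h, k2)], W)
        k4 = f_ode([x + dt * k for x, k in zip(h, k3)], W)
        h = [x + dt / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(h, k1, k2, k3, k4)]
    return h

def norm(h):
    return math.sqrt(sum(x * x for x in h))

sigma_max = 0.8
W = spectrally_normalize([[0.5, -1.5], [2.0, 0.3]], sigma_max)
h_N = [1.0, -0.5]   # latent state at the last observed timestamp t_N
delta_t = 1.5       # gap to the target timestamp t_{N+1}

h_next = rk4(h_N, W, delta_t)
bound = norm(h_N) * math.exp(sigma_max * delta_t)
print(norm(h_next) <= bound)
```

Capping the largest singular value caps how fast the latent state can grow between $t_N$ and $t_{N+1}$, so extrapolating over a long gap cannot blow up faster than exponentially at rate $\sigma_{\max}$.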