Repurposing Foundation Model for Generalizable Medical Time Series Classification

Nan Huang, Haishuai Wang, Zihuai He, Marinka Zitnik, Xiang Zhang

TL;DR

The paper addresses poor generalization of medical time series (MedTS) classification across heterogeneous datasets by introducing FORMED, a framework that repurposes a time series foundation model. FORMED decouples domain-agnostic representation learning (a frozen backbone and a shared decoding attention layer) from task-specific adaptation (Channel Embeddings and Label Queries), so adapting to a new dataset requires training only the label queries, about 0.1% of the parameters. In experiments on five MedTS datasets, FORMED achieves state-of-the-art performance on unseen subjects and shows scalable, data-efficient adaptation to unseen tasks, outperforming Task-Specific Model (TSM) and Task-Specific Adaptation (TSA) baselines by up to 35% absolute F1-score (on the ADFTD dataset). The work offers a practical pathway for deploying foundation models in healthcare settings with limited data and diverse recording configurations.

Abstract

Medical time series (MedTS) classification suffers from poor generalizability in real-world deployment due to inter- and intra-dataset heterogeneity, such as varying numbers of channels, signal lengths, task definitions, and patient characteristics. To address this, we propose FORMED, a novel framework for repurposing a backbone foundation model, pre-trained on generic time series, to enable highly generalizable MedTS classification on unseen datasets. FORMED combines the backbone with a novel classifier comprising two components: (1) task-specific channel embeddings and label queries, dynamically sized to match any number of channels and target classes, and (2) a shared decoding attention layer, jointly trained across datasets to capture medical domain knowledge through task-agnostic feature-query interactions. After repurposing, FORMED achieves seamless adaptation to unseen MedTS datasets through lightweight label query training (0.1% of parameters), eliminating the need for full fine-tuning or architectural redesign. We evaluate FORMED on 5 diverse MedTS datasets, benchmarking against 11 Task-Specific Models (TSM) and 4 Task-Specific Adaptation (TSA) methods. Our results demonstrate FORMED's dominant performance, achieving up to 35% absolute improvement in F1-score (on ADFTD dataset) over specialized baselines. Further analysis reveals consistent generalization across varying channel configurations, time series lengths, and clinical tasks, which are key challenges in real-world deployment. By decoupling domain-invariant representation learning from task-specific adaptation, FORMED establishes a scalable and resource-efficient paradigm for foundation model repurposing in healthcare. This approach prioritizes clinical adaptability over rigid task-centric design, offering a practical pathway for real-world implementation.
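
The two-component classifier described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the single-head cross-attention is a simplified stand-in for the shared decoding attention layer, and the feature width `D`, the class names `SharedDecodingAttention` and `TaskHead`, the projection `w_out` that turns each attended query into one logit, and the random stand-in for backbone features are all assumptions made for illustration. It only shows how channel embeddings and label queries can be sized per task while the attention weights stay shared.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # hypothetical feature width of the frozen backbone


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class SharedDecodingAttention:
    """Simplified single-head cross-attention, shared by every task."""

    def __init__(self, d):
        scale = 1.0 / np.sqrt(d)
        self.w_q = rng.normal(0, scale, (d, d))
        self.w_k = rng.normal(0, scale, (d, d))
        self.w_v = rng.normal(0, scale, (d, d))
        self.w_out = rng.normal(0, scale, (d,))  # assumption: one logit per query

    def __call__(self, queries, features):
        # queries: (K, d) label queries; features: (N, d) backbone tokens
        q = queries @ self.w_q
        k = features @ self.w_k
        v = features @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (K, N)
        return (attn @ v) @ self.w_out  # (K,) class logits


class TaskHead:
    """Task-specific parameters: one embedding per channel, one query per class."""

    def __init__(self, n_channels, n_classes, d):
        self.channel_emb = rng.normal(0, 0.02, (n_channels, 1, d))
        self.label_queries = rng.normal(0, 0.02, (n_classes, d))

    def logits(self, backbone_features, shared):
        # backbone_features: (C, P, d) = channels x patches x width
        tokens = backbone_features + self.channel_emb  # broadcast over patches
        flat = tokens.reshape(-1, tokens.shape[-1])    # (C * P, d)
        return shared(self.label_queries, flat)


shared = SharedDecodingAttention(D)
task_a = TaskHead(n_channels=19, n_classes=3, d=D)  # made-up configuration
task_b = TaskHead(n_channels=12, n_classes=5, d=D)  # made-up configuration

# Random arrays stand in for frozen-backbone outputs of different shapes.
logits_a = task_a.logits(rng.normal(size=(19, 8, D)), shared)
logits_b = task_b.logits(rng.normal(size=(12, 20, D)), shared)
print(logits_a.shape, logits_b.shape)
```

Both tasks pass through the same `shared` object even though they differ in channel count, sequence length, and number of classes; only the `TaskHead` parameters are task-specific, which mirrors the decoupling the abstract describes.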

Paper Structure

This paper contains 21 sections, 11 equations, 11 figures, and 5 tables.

Figures (11)

  • Figure 1: Paradigms of building models for different MedTS classification tasks. Task-Specific Model (TSM): Traditional classification models are designed for a specific input shape and set of output classes, and thus require retraining from scratch for each new dataset. Task-Specific Adaptation (TSA): With a pre-trained, frozen backbone foundation model, adapting to a new dataset requires training fewer parameters, such as pre- and post-backbone adapters. However, the combined model is no longer applicable to other tasks, so it lacks generalization across tasks and is more prone to overfitting. Generalizable Adaptation (GA): A post-backbone adaptation module is shared across the tasks of different datasets. It carries domain knowledge and transfers to unseen datasets by training only lightweight task-specific parameters, offering both generalizability and robustness against overfitting.
  • Figure 2: The three-stage process of adapting a time series foundation model for MedTS classification tasks. 1) Pre-training is already done on diverse general time series datasets with forecasting tasks. 2) Repurposing the foundation model involves changing the forecasting head to a classification head, while keeping the rest of the model fixed, and the new model is then trained on a cohort of MedTS datasets to capture domain knowledge in MedTS. 3) Adapting the repurposed model to the new MedTS datasets, where only the minimal task-specific parameters are trained, leveraging the previously learned domain knowledge from the repurposed model.
  • Figure 3: The architecture of the proposed model during repurposing and adapting. The backbone foundation model acts as a feature extractor and remains frozen throughout. The Channel Embeddings (CEs) and Label Queries (LQs) are task-specific parameters that are learned during both repurposing and adapting; new ones are created and learned for each new dataset. The Shared Decoding Attention (SDA) is a shared Transformer decoder layer that captures the interaction between all the features and classes. Once trained on the curated MedTS datasets $\mathcal{D}^{\text{Med}}$ during repurposing, it is fixed and reused when adapting to all future datasets and tasks $\mathcal{D}^{\text{New}}$. The $\oplus$ denotes broadcast addition.
  • Figure 4: In-domain F1 performance on the MedTS cohort datasets. FORMED achieves SOTA-level performance across all datasets in all metrics. Numerical results are shown in \ref{tab:full-result}. Other metrics are included in \ref{sec:experiment-results}.
  • Figure 5: Adapt-time scaling on unseen, out-of-domain datasets. FORMED's performance scales well with $k$, following a power law, and outperforms TimesFM-TSA starting from $k=64$ on ECG200 and from $k=16$ on StandWalkJump. For numerical results, see \ref{tab:adaptation}.
  • ...and 6 more figures
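
The power-law scaling mentioned in the Figure 5 caption can be checked with a log-log linear fit. The sketch below is only an illustration under stated assumptions: it assumes the error (for example, one minus the score) behaves as $a \cdot k^{-b}$, the function name `fit_power_law` is hypothetical, and the data points are synthetic values generated from a known power law, not results from the paper.

```python
import math


def fit_power_law(ks, errors):
    """Least-squares fit of log(error) = log(a) - b * log(k); returns (a, b)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data only: generated from a known power law to exercise the fit.
ks = [4, 16, 64, 256]
true_a, true_b = 0.8, 0.25
errors = [true_a * k ** (-true_b) for k in ks]

a, b = fit_power_law(ks, errors)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On real measurements, points that fall close to a straight line in log-log space would be consistent with power-law scaling, and the fitted exponent `b` summarizes how quickly the error shrinks as $k$ grows.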

Theorems & Definitions (2)

  • Definition 3.1
  • Definition 3.2