Bridging Performance Gaps for ECG Foundation Models: A Post-Training Strategy

Ya Zhou, Yujie Yang, Xiaohan Fan, Wei Zhao

TL;DR

ECG foundation models often underperform task-specific models even after pre-training and fine-tuning, which limits their clinical applicability. The authors propose a simple two-stage post-training strategy for Transformer-based ECG models: initialization via linear probing, followed by a regularization stage with stochastic depth. On PTB-XL, the approach improves macro AUROC by 0.7%-8.9% and macro AUPRC by 23.3%-77.9% over baseline fine-tuning, often surpassing state-of-the-art task-specific methods, while also accelerating convergence and improving data efficiency. The findings support post-training as a practical way to close the adaptation gap for ECG foundation models and motivate extending it to other architectures and datasets.
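
For readers unfamiliar with the headline metrics, the following is a minimal sketch of macro averaging: the metric is computed separately for each diagnosis label and the per-label values are averaged with equal weight. The function names are illustrative and not taken from the paper, and average precision is used here as one common estimator of AUPRC; the paper's exact estimator is an assumption.

```python
from typing import Callable, Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC for one label: fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Tied scores count as half a correctly ranked pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision for one label (assumed AUPRC estimator; ties not merged)."""
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


def macro_average(
    metric: Callable[[Sequence[int], Sequence[float]], float],
    Y_true: Sequence[Sequence[int]],
    Y_score: Sequence[Sequence[float]],
) -> float:
    """Unweighted mean of a per-label metric; rows are records, columns are labels."""
    n_labels = len(Y_true[0])
    per_label = [
        metric([row[j] for row in Y_true], [row[j] for row in Y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Toy example: 4 records, 2 labels.
Y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
Y_score = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.3, 0.6]]
macro_auroc = macro_average(auroc, Y_true, Y_score)
macro_auprc = macro_average(average_precision, Y_true, Y_score)
```

Because every label carries equal weight, rare diagnoses influence the macro score as much as common ones.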

Abstract

ECG foundation models are increasingly popular due to their adaptability across various tasks. However, their clinical applicability is often limited by performance gaps compared to task-specific models, even after pre-training on large ECG datasets and fine-tuning on target data. This limitation is likely due to the lack of an effective post-training strategy. In this paper, we propose a simple yet effective post-training approach to enhance ECG foundation models. We evaluate it on a publicly available Transformer-based foundation model. Experiments across multiple ECG tasks show that our method consistently outperforms baseline fine-tuning. On the PTB-XL benchmarks, it improves macro AUROC by 0.7%-8.9% and macro AUPRC by 23.3%-77.9%, also outperforming several recent state-of-the-art approaches, including task-specific and advanced architectures. Further analyses demonstrate improved training dynamics and data efficiency, with the method trained on only 30% of the training data outperforming the baseline trained on the full dataset. Ablation studies highlight the importance of stochastic depth and preview linear probing. These findings underscore the potential of post-training strategies to improve ECG foundation models, and we hope this work will contribute to the continued development of foundation models in the ECG domain.
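
The two ingredients named above can be illustrated with a small framework-free sketch. It is not the authors' implementation. It assumes that the linear-probing stage trains only the classification head on a frozen backbone, that the subsequent stage fine-tunes all parameters with stochastic depth enabled, and that a dropped residual branch is compensated by rescaling kept branches by 1 / (1 - drop_prob) during training. The names `STAGES` and `stochastic_depth_residual` and the drop probability 0.1 are hypothetical placeholders.

```python
import random
from typing import Callable, List

Vector = List[float]

# Hypothetical two-stage plan: which parameter groups are updated in each stage.
STAGES = (
    {"name": "linear_probe", "trainable": {"head"}, "drop_prob": 0.0},
    {"name": "fine_tune", "trainable": {"backbone", "head"}, "drop_prob": 0.1},
)


def trainable_groups(stage_name: str) -> set:
    """Return the parameter groups that receive gradient updates in a stage."""
    for stage in STAGES:
        if stage["name"] == stage_name:
            return set(stage["trainable"])
    raise KeyError(stage_name)


def stochastic_depth_residual(
    x: Vector,
    branch: Callable[[Vector], Vector],
    drop_prob: float,
    training: bool,
    rng: random.Random,
) -> Vector:
    """Residual update x + branch(x), with the branch randomly skipped in training."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        # Inference (or no regularization): the full residual block is applied.
        return [xi + bi for xi, bi in zip(x, branch(x))]
    if rng.random() < drop_prob:
        # Branch dropped: the block reduces to the identity mapping.
        return list(x)
    keep = 1.0 - drop_prob
    # Branch kept: rescale so the expected output equals x + branch(x).
    return [xi + bi / keep for xi, bi in zip(x, branch(x))]
```

In a Transformer, `branch` would stand for the attention or feed-forward sub-layer of a block. Randomly skipping sub-layers during fine-tuning acts as a regularizer, while the head learned in the linear-probing stage gives fine-tuning a better starting point than a randomly initialized head.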

Paper Structure

This paper contains 20 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Overview of the proposed post-training framework for ECG classification.
  • Figure 2: Per-label comparison of AUROC and AUPRC differences between the proposed post-training strategy and the baseline (Transformer-FM-PT $-$ Transformer-PT) across 71 ECG diagnoses. Bars represent the observed differences on the test set, and vertical error bars indicate the 95% confidence intervals estimated from 1,000 bootstrap resamples.
  • Figure 3: Validation set performance of the proposed post-training strategy compared with the baseline over 50 training epochs for the all-71 task. (a) AUROC and (b) AUPRC are shown for both methods. The early stopping epoch, determined based on the validation AUROC, is highlighted with a star marker.
  • Figure 4: Test set performance of the proposed post-training strategy under varying proportions of training data. (a) AUROC and (b) AUPRC are reported for 10%, 20%, ..., up to 100% of the training data. The dashed horizontal line indicates the baseline performance trained using 100% of the training data.
  • Figure 5: Validation set performance of the proposed post-training strategy for the ablation study on the all-71 task. (a) AUROC and (b) AUPRC are shown for different ablation settings. Bars represent performance for each setting, with color intensity reflecting the relative magnitude of each metric. Full denotes using the complete post-training strategy; w/o SD denotes removing the stochastic depth strategy; w/o LP denotes removing the linear probing initialization strategy.
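
The confidence intervals described for Figure 2 can be approximated with a paired percentile bootstrap. The sketch below is an illustration under stated assumptions, not the authors' code: test records are resampled with replacement, the same resample is scored for both models, resamples that lack a positive or a negative record are redrawn, and the interval is taken from the 2.5th and 97.5th percentiles of the differences. The function names and the redraw rule are hypothetical.

```python
import random
from typing import Sequence, Tuple


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC for one label: fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Tied scores count as half a correctly ranked pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_diff_ci(
    y_true: Sequence[int],
    score_a: Sequence[float],
    score_b: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Observed AUROC(a) - AUROC(b) and a percentile bootstrap interval for it."""
    rng = random.Random(seed)
    n = len(y_true)
    observed = auroc(y_true, score_a) - auroc(y_true, score_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample records with replacement
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # assumption: redraw resamples that contain a single class
        a = auroc(yt, [score_a[i] for i in idx])
        b = auroc(yt, [score_b[i] for i in idx])
        diffs.append(a - b)  # paired: both models see the same resample
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return observed, lower, upper
```

Calling `bootstrap_diff_ci(y_true, scores_post_trained, scores_baseline)` once per diagnosis label would yield one bar and one interval per label. Pairing the resamples matters because both models are evaluated on the same test records, so their errors are correlated.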