CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models

Yuxuan Shu, Peter H. Charlton, Fahim Kawsar, Jussi Hernesniemi, Mohammad Malekzadeh

TL;DR

CLEF introduces a clinically-guided contrastive pretraining approach for ECG foundation models by leveraging SCORE2-based risk scores to adaptively weight negative pairs and by aligning embedding dissimilarities with clinically meaningful risk differences. The method handles missing metadata and demonstrates robust improvements over strong self-supervised baselines across multiple downstream tasks and datasets, achieving competitive performance with supervised ECGFounder when pretraining leads align. This enables more accurate, scalable single-lead ECG analysis using unlabeled data with readily available metadata, advancing remote health monitoring. The work also provides extensive ablations and establishes a framework for incorporating domain knowledge into contrastive learning for biomedical signals.

Abstract

The electrocardiogram (ECG) is a key diagnostic tool in cardiovascular health. Single-lead ECG recording is integrated into both clinical-grade and consumer wearables. While self-supervised pretraining of foundation models on unlabeled ECGs improves diagnostic performance, existing approaches do not incorporate domain knowledge from clinical metadata. We introduce a novel contrastive learning approach that utilizes an established clinical risk score to adaptively weight negative pairs: clinically-guided contrastive learning. It aligns the similarities of ECG embeddings with clinically meaningful differences between subjects, with an explicit mechanism to handle missing metadata. On 12-lead ECGs from 161K patients in the MIMIC-IV dataset, we pretrain single-lead ECG foundation models at three scales, collectively called CLEF, using only routinely collected metadata without requiring per-sample ECG annotations. We evaluate CLEF on 18 clinical classification and regression tasks across 7 held-out datasets, and benchmark against 5 foundation model baselines and 3 self-supervised algorithms. When pretrained on 12-lead ECG data and tested on lead-I data, CLEF outperforms self-supervised foundation model baselines: the medium-sized CLEF achieves average AUROC improvements of at least 2.6% in classification and average reductions in MAE of at least 3.2% in regression. Compared with existing self-supervised learning algorithms, CLEF improves the average AUROC by at least 1.8%. Moreover, when pretrained only on lead-I data, CLEF performs comparably on classification tasks to the state-of-the-art ECGFounder, which was trained in a supervised manner. Overall, CLEF enables more accurate and scalable single-lead ECG analysis, advancing remote health monitoring. Code and pretrained CLEF models are available at: github.com/Nokia-Bell-Labs/ecg-foundation-model.
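
The two ideas named in the abstract, weighting negative pairs by a clinical risk score and aligning embedding dissimilarities with risk differences, can be illustrated with a minimal NumPy sketch. This is not the paper's formulation: the function names, the linear weight `1 + alpha * |risk_i - risk_j|`, the squared-error alignment term, and the choice to fall back to a weight of 1 (plain InfoNCE) when a risk score is missing are all assumptions made here for illustration. Risk scores are assumed to be rescaled to a small range such as [0, 1], with `np.nan` marking missing metadata.

```python
import numpy as np


def risk_weighted_info_nce(z1, z2, risk, tau=0.1, alpha=1.0):
    """Illustrative InfoNCE loss with risk-weighted negatives (hypothetical form).

    z1, z2 : (N, D) embeddings of two augmented views of the same N ECGs.
    risk   : (N,) per-subject risk scores; np.nan marks missing metadata.
    """
    # L2-normalise so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau  # (N, N); the diagonal holds the positive pairs

    # Assumed weighting: a larger risk gap gives a negative pair more weight.
    gap = np.abs(risk[:, None] - risk[None, :])
    weights = 1.0 + alpha * gap
    # Assumed fallback: pairs with missing metadata keep weight 1.
    weights = np.where(np.isnan(gap), 1.0, weights)

    # Weighted log-sum-exp over each row, computed stably.
    logits = sim + np.log(weights)
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))
    return float(np.mean(log_denominator - np.diag(sim)))


def dissimilarity_alignment(z, risk):
    """Illustrative alignment term: cosine dissimilarity should track the risk gap."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    dissim = 1.0 - z @ z.T
    gap = np.abs(risk[:, None] - risk[None, :])
    known = ~np.isnan(gap)  # skip pairs where either risk score is missing
    if not known.any():
        return 0.0
    return float(np.mean((dissim[known] - gap[known]) ** 2))
```

With `alpha=0`, or when every risk score is missing, the first function reduces to a standard InfoNCE loss, which is one simple way to keep samples without metadata usable during pretraining.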

Paper Structure

This paper contains 36 sections, 10 equations, 6 figures, 23 tables, 1 algorithm.

Figures (6)

  • Figure 1: CLEF's framework and performance overview (see the Method section for notation). (A) Our clinically-guided contrastive pretraining. Key components include a negative weighting loss $\mathcal{L}^w$ and a dissimilarity alignment loss $\mathcal{L}^d$ that work in tandem to guide contrastive learning with clinical knowledge. (B) Spider plot of the AUROC performance of CLEF-M (our medium-sized model) across $13$ downstream classification tasks. Baseline performances are shown as gray lines (see the Results section).
  • Figure 2: AUROC scores from linear probing on $9$ classification tasks, comparing Moirai, Moment, ST-MEM, KED, and our CLEF. Each subplot focuses on one model, with others shown in gray for reference. Higher values indicate better performance (see the linear-probing single-lead AUROC table for further details).
  • Figure 3: Changes in AUROC of KED after further training with CLEF objectives across downstream ECG tasks.
  • Figure 4: Spider plot comparing CLEF with CLEF models pretrained on a single specific lead (CLEF$^\texttt{I}$ and CLEF$^\texttt{II}$). AUROC is reported across $3$ model variants and $7$ downstream tasks.
  • Figure S1: Physiological noise for single-lead ECG contrastive learning. We include four common sources of signal degradation in wearable ECG devices: electrode movement artifacts, baseline wander, motion-induced distortions, and additive noise. Original signals are shown in black, with augmented versions in color.
  • ...and 1 more figure