Echo2ECG: Enhancing ECG Representations with Cardiac Morphology from Multi-View Echos

Michelle Espranita Liman, Özgün Turgut, Alexander Müller, Eimo Martens, Daniel Rueckert, Philip Müller

TL;DR

Echo2ECG is a multimodal self-supervised learning framework that enriches ECG representations with the heart's morphological structure captured in multi-view Echos. It is evaluated as an ECG feature extractor on two clinically relevant tasks that fundamentally require morphological information.

Abstract

Electrocardiography (ECG) is a low-cost, widely used modality for diagnosing electrical abnormalities like atrial fibrillation by capturing the heart's electrical activity. However, it cannot directly measure cardiac morphological phenotypes, such as left ventricular ejection fraction (LVEF), which typically require echocardiography (Echo). Predicting these phenotypes from ECG would enable early, accessible health screening. Existing self-supervised methods suffer from a representational mismatch by aligning ECGs to single-view Echos, which only capture local, spatially restricted anatomical snapshots. To address this, we propose Echo2ECG, a multimodal self-supervised learning framework that enriches ECG representations with the heart's morphological structure captured in multi-view Echos. We evaluate Echo2ECG as an ECG feature extractor on two clinically relevant tasks that fundamentally require morphological information: (1) classification of structural cardiac phenotypes across three datasets, and (2) retrieval of Echo studies with similar morphological characteristics using ECG queries. Our extracted ECG representations consistently outperform those of state-of-the-art unimodal and multimodal baselines across both tasks, despite being 18x smaller than the largest baseline. These results demonstrate that Echo2ECG is a robust, powerful ECG feature extractor. Our code is accessible at https://github.com/michelleespranita/Echo2ECG.

Paper Structure

This paper contains 17 sections, 1 equation, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of Echo2ECG. To address the representational mismatch in prior methods aligning ECGs with single-view Echos, our framework uses contrastive learning to align ECGs with multi-view Echo studies. This multi-view alignment distills comprehensive cardiac morphological information from Echos into ECG representations. We aggregate view-level Echo embeddings from a frozen, powerful Echo encoder into a study-level Echo embedding. During pre-training, we optimize only the ECG encoder, the Echo view aggregator, and modality-specific projection layers. The ECG encoder is then used to extract ECG representations for subsequent downstream tasks.
  • Figure 2: AUROC ($\uparrow$) for ECG-based SHD classification. (a) A kNN classifier trained on just 1% of data using ECG representations extracted by Echo2ECG outperforms most models trained on 100% of the data, meaning that Echo2ECG provides robust ECG features that work well under low-data regimes. (b) Echo2ECG outperforms EchoingECG at $0.1\%$ training data, despite being 18$\times$ smaller.
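The alignment described in the Figure 1 caption can be illustrated with a minimal sketch. Two points here are assumptions rather than details taken from the paper: the view aggregator is replaced by plain mean pooling (the caption describes an aggregator that is optimized during pre-training), and the contrastive objective is taken to be a symmetric InfoNCE loss with a hypothetical temperature of 0.1. The function names are illustrative, and the toy vectors stand in for outputs of the frozen Echo encoder and the ECG encoder.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def aggregate_views(view_embeddings):
    """Mean-pool view-level Echo embeddings into one study-level embedding.
    ASSUMPTION: a stand-in for the learned Echo view aggregator."""
    n = len(view_embeddings)
    dim = len(view_embeddings[0])
    return [sum(v[d] for v in view_embeddings) / n for d in range(dim)]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row maximum for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def symmetric_info_nce(ecg_embs, study_embs, temperature=0.1):
    """Symmetric InfoNCE over paired (ECG, Echo study) embeddings.
    ASSUMPTION: the loss form and temperature are illustrative choices."""
    ecg = [l2_normalize(e) for e in ecg_embs]
    echo = [l2_normalize(s) for s in study_embs]
    # Cosine similarity of every ECG with every study, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(e, s)) / temperature for s in echo]
              for e in ecg]
    logits_t = [list(col) for col in zip(*logits)]  # Echo-to-ECG direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))

# Toy batch: two Echo studies with different numbers of views.
studies = [
    [[1.0, 0.0], [0.8, 0.2]],              # study 0: two views
    [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]],  # study 1: three views
]
study_embs = [aggregate_views(views) for views in studies]

ecg_matched = [[1.0, 0.1], [0.1, 1.0]]          # ECG i belongs to study i
ecg_swapped = [ecg_matched[1], ecg_matched[0]]  # deliberately mispaired

loss_matched = symmetric_info_nce(ecg_matched, study_embs)
loss_swapped = symmetric_info_nce(ecg_swapped, study_embs)
```

In this toy batch the correctly paired ECGs yield a lower loss than the mispaired ones, which is the signal that would push an ECG encoder toward the study-level Echo embedding. In an actual training setup, gradients would flow only into the ECG encoder, the aggregator, and the projection layers, since the caption states that the Echo encoder is frozen.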