MedalCare-XL: 16,900 healthy and pathological 12 lead ECGs obtained through electrophysiological simulations

Karli Gillette, Matthias A. F. Gsell, Claudia Nagel, Jule Bender, Benjamin Winkler, Steven E. Williams, Markus Bär, Tobias Schäffter, Olaf Dössel, Gernot Plank, Axel Loewe

TL;DR

A comparison of extracted features between the virtual cohort and a publicly available clinical ECG database demonstrated that the synthetic signals represent clinical ECGs for healthy and pathological subpopulations with high fidelity.

Abstract

Mechanistic cardiac electrophysiology models allow for personalized simulations of the electrical activity in the heart and the ensuing electrocardiogram (ECG) on the body surface. As such, synthetic signals possess known ground truth labels of the underlying disease and can be employed for validation of machine learning ECG analysis tools in addition to clinical signals. Recently, synthetic ECGs were used to enrich sparse clinical data or even replace them completely during training, leading to improved performance on real-world clinical test data. We thus generated a novel synthetic database comprising a total of 16,900 12 lead ECGs based on electrophysiological simulations equally distributed into healthy control and 7 pathology classes. The pathological case of myocardial infarction had 6 sub-classes. A comparison of extracted features between the virtual cohort and a publicly available clinical ECG database demonstrated that the synthetic signals represent clinical ECGs for healthy and pathological subpopulations with high fidelity. The ECG database is split into training, validation, and test folds for development and objective assessment of novel machine learning algorithms.
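
The class balance and fold split described in the abstract can be illustrated with a short sketch. It rests on two assumptions that the abstract does not state explicitly: that myocardial infarction is counted through its 6 sub-classes rather than as a single class (giving 13 equally sized groups), and that folds are drawn per group in an 80/10/10 ratio. The ratio and the helper `stratified_split` are hypothetical placeholders, not the database's actual fold definition.

```python
import random
from collections import defaultdict

TOTAL_ECGS = 16_900      # from the abstract
N_HEALTHY = 1            # healthy control
N_PATHOLOGIES = 7        # from the abstract
N_MI_SUBCLASSES = 6      # from the abstract

# Assumed reading: MI contributes its 6 sub-classes instead of 1 class.
n_groups = N_HEALTHY + (N_PATHOLOGIES - 1) + N_MI_SUBCLASSES
per_group, remainder = divmod(TOTAL_ECGS, n_groups)


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split indices into train/val/test so every label keeps `fractions`.

    The 80/10/10 default is a placeholder chosen for illustration only.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_train = round(len(idxs) * fractions[0])
        n_val = round(len(idxs) * fractions[1])
        train.extend(idxs[:n_train])
        val.extend(idxs[n_train:n_train + n_val])
        test.extend(idxs[n_train + n_val:])  # remainder goes to test
    return train, val, test


# Toy labels: one integer id per group, `per_group` signals each.
labels = [g for g in range(n_groups) for _ in range(per_group)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(n_groups, per_group, remainder)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting within each group, rather than over the pooled signals, keeps every fold as balanced as the full database under these assumptions.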

Paper Structure (25 sections, 4 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Pipeline for the generation and validation of the synthetic 12 lead ECG database using individual multi-scale models of the atria and the ventricles.
  • Figure 2: Cohort of ventricular-torso models derived from clinical MRIs. Tissues include lungs, blood pools, atrial tissue, ventricles, and general torso. Parameters dictating ventricular electrophysiology for normal healthy control were varied within physiological ranges. Disease conditions of bundle branch block (BBB) and myocardial infarction (MI) were then modeled by making adaptations to the model.
  • Figure 3: Anatomical model cohort for atrial simulations. 80 atrial geometries with physiological left and right atrial volumes were derived from a bi-atrial statistical shape model [Nagel-2021-ID16581] and served as a basis for normal healthy control simulations. In these models, 9 different volume fractions of tissue were additionally replaced by fibrosis for simulations of fibrotic atrial cardiomyopathy. Interatrial conduction block signals were generated by blocking conduction in Bachmann's Bundle in the same 80 geometries. Furthermore, 45 geometries with enlarged left atrial volumes were generated. As for the torso anatomy, 25 geometries were derived from a human body statistical shape model to account for height, weight and gender differences in the virtual patient cohort. Moreover, the rotation angle as well as the spatial position of the atria inside the torso were varied within physiological ranges.
  • Figure 4: (A) Exemplary 10 s ECGs (lead II) of each pathology class and a normal healthy control in the virtual cohort. (B) Exemplary 10 s ECGs (lead II) of each MI pathology class for different occlusion sites and degrees of transmurality.
  • Figure 5: Comparison of features in the healthy clinical and virtual cohorts. Probability density functions are shown for timing features (left column, from top to bottom: P wave duration, QRS duration, T wave duration, PR interval, QT interval, RR interval) and amplitude features (right column, from top to bottom: P wave amplitude, Q / R / S peak amplitude, T wave amplitude). Blue and red curves represent the distributions calculated based on the clinical and the simulated data, respectively.
  • ...and 3 more figures
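
The kind of comparison shown in Figure 5 can be sketched in a few lines: estimate a probability density for one feature in each cohort and measure how much the two densities overlap. This is a minimal illustration, not the authors' analysis. The histogram estimator, the overlap coefficient, the function names, and the toy QRS-duration samples are all assumptions introduced here, and the numbers are random placeholders rather than values from either database.

```python
import random


def histogram_density(samples, lo, hi, n_bins):
    """Normalized histogram whose bin heights integrate to 1 over [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    kept = 0
    for x in samples:
        if lo <= x < hi:
            # Clamp guards against float rounding at the upper edge.
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
            kept += 1
    if kept == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    return [c / (kept * width) for c in counts], width


def overlap_coefficient(a, b, n_bins=30):
    """Area shared by the two histogram densities (0 = disjoint, 1 = identical)."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    hi += 1e-9 * max(1.0, abs(hi))  # make the largest sample fall in the last bin
    dens_a, width = histogram_density(a, lo, hi, n_bins)
    dens_b, _ = histogram_density(b, lo, hi, n_bins)
    return sum(min(pa, pb) for pa, pb in zip(dens_a, dens_b)) * width


# Placeholder samples standing in for QRS durations in ms (made up for illustration).
rng = random.Random(0)
clinical_qrs = [rng.gauss(95.0, 10.0) for _ in range(2000)]
simulated_qrs = [rng.gauss(97.0, 9.0) for _ in range(2000)]
print(f"overlap: {overlap_coefficient(clinical_qrs, simulated_qrs):.2f}")
```

A smooth kernel density estimate would resemble the plotted curves more closely; the histogram version is used here only because it needs nothing beyond the standard library.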