SSSD-ECG-nle: New Label Embeddings with Structured State-Space Models for ECG generation
Sergey Skorik, Aram Avetisyan
TL;DR
This work tackles privacy concerns in ECG sharing by generating high-fidelity synthetic digital 12-lead ECGs using diffusion models built on Structured State Space Models. It introduces SSSD-ECG-nle, a conditional diffusion generator that employs new label embeddings to better capture neutral and abnormal states, and evaluates it with both quantitative measures (TSTR, TRTS, G-mean, F1, ROC-AUC) and expert physician assessments. The study demonstrates advantages in downstream task performance and convergence behavior while highlighting a gap between quantitative realism and human indistinguishability, which suggests directions for improving synthetic ECG realism. The authors also provide synthetic data and code to promote reproducibility and further research in privacy-preserving ECG synthesis.
Abstract
An electrocardiogram (ECG) is vital for identifying cardiac diseases, offering crucial insights for diagnosing heart conditions and informing potentially life-saving treatments. However, like other types of medical data, ECGs are subject to privacy concerns when distributed and analyzed. Diffusion models have made significant progress in recent years, making it possible to synthesize data comparable to real data and allowing such data to be widely adopted without privacy concerns. In this paper, we use diffusion models with structured state spaces to generate digital 10-second 12-lead ECG signals. We propose the SSSD-ECG-nle architecture, which is based on SSSD-ECG with a modified conditioning mechanism, and demonstrate its efficiency on downstream tasks. We conduct quantitative and qualitative evaluations, including an analysis of convergence speed, the impact of adding positive samples, and an assessment drawing on physicians' expert knowledge. Finally, we share the results of the physician evaluations and make the synthetic data available to ensure the reproducibility of the experiments described.
