RawECGNet: Deep Learning Generalization for Atrial Fibrillation Detection from the Raw ECG
Noam Ben-Moshe, Kenta Tsutsui, Shany Biton, Leif Sörnmo, Joachim A. Behar
TL;DR
RawECGNet introduces a two-stage, morphology-aware deep learning architecture to detect AF and AFl from raw single-lead ECG, achieving superior generalization across leads and external cohorts compared with a rhythm-based RR-interval model. By integrating a ResNet-based encoder with domain-shift uncertainty and a BiGRU for temporal context, trained on multiple leads, the model robustly handles distribution shifts due to geography, lead position, and demographics. Extensive evaluation on UVAF, RBDB, and SHDB demonstrates improved per-window F1 scores and substantially lower AF burden estimation error, with detailed ablation and error analyses identifying key drivers of performance and residual challenges. The work underscores the value of morphology information and cross-lead training for robust AF/AFl detection, and points to future directions including 12-lead data and ECG foundation models to further enhance generalization in diverse real-world settings.
Abstract
Introduction: Deep learning models that detect episodes of atrial fibrillation (AF) from rhythm information in long-term, ambulatory ECG recordings have shown high performance. However, the rhythm-based approach does not exploit the morphological information conveyed by the different ECG waveforms, particularly the f-waves, so the performance of such models may be inherently limited. Methods: To address this limitation, we developed a deep learning model, named RawECGNet, to detect episodes of AF and atrial flutter (AFl) from the raw, single-lead ECG. We evaluate the generalization performance of RawECGNet on two external data sets that account for distribution shifts in geography, ethnicity, and lead position. RawECGNet is further benchmarked against a state-of-the-art deep learning model, named ArNet2, which uses rhythm information as input. Results: On the external test sets, the F1 scores of RawECGNet across the different leads were 0.91–0.94 in RBDB and 0.93 in SHDB, compared to 0.89–0.91 in RBDB and 0.91 in SHDB for ArNet2. The results highlight RawECGNet as a high-performance, generalizable algorithm for detection of AF and AFl episodes, exploiting information on both rhythm and morphology.
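
To make the reported metrics concrete, the following is a minimal sketch of how a per-window F1 score and an AF burden estimation error could be computed from binary window labels (1 = AF/AFl, 0 = other). It assumes equal-length windows, so that AF burden reduces to the fraction of windows labeled positive, and it takes the burden error as the absolute difference between predicted and reference burden. The function names and the toy labels are illustrative assumptions, not the authors' evaluation code.

```python
def f1_score(y_true, y_pred):
    """Per-window F1 for binary labels (1 = AF/AFl window, 0 = other)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def af_burden(labels):
    """Fraction of windows labeled AF/AFl (assumes equal-length windows)."""
    return sum(labels) / len(labels) if labels else 0.0


def burden_error(y_true, y_pred):
    """Absolute difference between predicted and reference AF burden."""
    return abs(af_burden(y_pred) - af_burden(y_true))


# Toy recording of eight windows (hypothetical labels).
reference = [0, 0, 1, 1, 1, 0, 0, 1]
predicted = [0, 1, 1, 1, 0, 0, 0, 0]

f1 = f1_score(reference, predicted)
err = burden_error(reference, predicted)
print(f"F1 = {f1:.3f}, AF burden error = {err:.3f}")
```

The two metrics answer different questions: F1 penalizes every misclassified window, whereas the burden error only compares the total proportion of time labeled as AF, so false positives and false negatives can partly offset each other.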
