NERULA: A Dual-Pathway Self-Supervised Learning Framework for Electrocardiogram Signal Analysis
Gouthamaan Manimaran, Sadasivan Puthusserypady, Helena Domínguez, Adrian Atienza, Jakob E. Bardram
TL;DR
NERULA addresses the challenge of learning robust ECG representations with limited labels by introducing a dual-pathway self-supervised framework that combines a non-contrastive discriminative path with a reconstruction-based generative path. It employs a 50% temporal masking strategy to generate two complementary views, $x_i = m \odot x$ and $x_j = (1-m) \odot x$, and uses a LocalLead encoder to enable efficient time-series processing. The two losses are weighted to form a unified objective $\mathcal{L} = \mathcal{L}_{nc} + 10 \mathcal{L}_{recon}$, where the non-contrastive term aligns the representations of the two views via cosine similarity and the reconstruction term uses a robust Huber loss $L_{\delta}$. Pretrained on PhysioNet 2020 and evaluated on PhysioNet 2017, PTB-XL, and HAR benchmarks, NERULA outperforms state-of-the-art SSL methods such as BYOL, SimCLR, CLOCS, and Ti-MAE in downstream linear evaluations, demonstrating superior feature learning for arrhythmia classification, gender and age prediction, and activity recognition. The combination of a dual-pathway design, latent-space masking, and a robust reconstruction objective yields practical, data-efficient ECG representations with potential impact for wearable health monitoring. Its limitations relate to hyperparameter choices and loss weighting, which warrant further systematic optimization.
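The masking and loss formulas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the element-wise random mask, the negative-cosine form of $\mathcal{L}_{nc}$, the Huber threshold `delta=1.0`, and all function names are assumptions made for illustration. Only the 50% ratio, the complementary views, the Huber reconstruction loss, and the weight of 10 come from the summary.

```python
import numpy as np

def complementary_views(x, mask_ratio=0.5, rng=None):
    """Return x_i = m * x, x_j = (1 - m) * x and the binary mask m.

    Assumption: the mask is drawn per sample along the last (time) axis;
    the paper may mask contiguous segments instead.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[-1]
    n_keep = int(round(n * mask_ratio))
    m = np.zeros(n, dtype=x.dtype)
    m[rng.choice(n, size=n_keep, replace=False)] = 1
    return m * x, (1 - m) * x, m

def negative_cosine(p, z, eps=1e-8):
    """Assumed form of L_nc: mean negative cosine similarity over the batch."""
    num = np.sum(p * z, axis=-1)
    den = np.linalg.norm(p, axis=-1) * np.linalg.norm(z, axis=-1) + eps
    return float(np.mean(-num / den))

def huber(pred, target, delta=1.0):
    """Huber loss L_delta: quadratic for small residuals, linear for large ones."""
    r = np.abs(pred - target)
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return float(np.mean(np.where(r <= delta, quadratic, linear)))

def total_loss(l_nc, l_recon, recon_weight=10.0):
    """Unified objective L = L_nc + 10 * L_recon."""
    return l_nc + recon_weight * l_recon
```

Because the two masks are complementary, the views never overlap and `x_i + x_j` recovers the original signal, so each view sees exactly the samples the other is missing.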
Abstract
Electrocardiogram (ECG) signals are critical for diagnosing heart conditions and capturing detailed cardiac patterns. As wearable single-lead ECG devices become more common, efficient analysis methods are essential. We present NERULA (Non-contrastive ECG and Reconstruction Unsupervised Learning Algorithm), a self-supervised framework designed for single-lead ECG signals. NERULA's dual-pathway architecture combines ECG reconstruction and non-contrastive learning to extract detailed cardiac features. Our 50% masking strategy, using both masked and inverse-masked signals, enhances model robustness against real-world incomplete or corrupted data. The non-contrastive pathway aligns representations of masked and inverse-masked signals, while the reconstruction pathway learns to reconstruct the missing features. We show that combining generative and discriminative pathways during training improves results: NERULA outperforms state-of-the-art self-supervised learning methods across various tasks, including arrhythmia classification, gender classification, age regression, and human activity recognition. NERULA's dual-pathway design offers a robust, efficient solution for comprehensive ECG signal interpretation.
