TolerantECG: A Foundation Model for Imperfect Electrocardiogram
Huynh Dang Nguyen, Trong-Thang Pham, Ngan Le, Van Nguyen
TL;DR
TolerantECG tackles the challenge of diagnosing cardiac conditions from imperfect ECG data by learning robust multimodal representations that align ECG signals with detailed text reports. The framework combines Cardiac Feature Retrieval (CFR) to generate informative diagnostic descriptions and a dual-mode distillation scheme (DuoDistill) to handle lead-missing and noisy signals, with alternating training to reinforce robustness. On PTB-XL, the model ranks best or second-best across the tested signal conditions and class levels, and on the MIT-BIH Arrhythmia Database it achieves the highest performance, indicating strong transferability and resilience to common ECG artifacts. This work advances practical ECG analysis by enabling reliable interpretation of incomplete or degraded signals, reducing diagnostic uncertainty in real-world settings.
Abstract
The electrocardiogram (ECG) is an essential and effective tool for diagnosing heart diseases. However, its effectiveness can be compromised by noise or by the unavailability of one or more leads of the standard 12-lead recording, resulting in diagnostic errors or uncertainty. To address these challenges, we propose TolerantECG, a foundation model for ECG signals that is robust to noise and capable of functioning with arbitrary subsets of the standard 12-lead ECG. TolerantECG training combines contrastive and self-supervised learning frameworks to jointly learn representations of ECG signals, of their corresponding knowledge-retrieval-based text report descriptions, and of corrupted or lead-missing versions of those signals. Comprehensive benchmarking results demonstrate that TolerantECG consistently ranks as the best or second-best performer across various ECG signal conditions and class levels in the PTB-XL dataset, and achieves the highest performance on the MIT-BIH Arrhythmia Database.
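To make the two kinds of imperfection concrete, the following is a minimal sketch of how a clean 12-lead recording could be turned into a lead-missing or a noisy input. It is an illustration only, not the authors' pipeline: zero-filling the absent leads, the additive Gaussian noise model, and the helper names `drop_leads` and `add_noise` are all assumptions introduced here.

```python
import random

# Standard 12-lead ordering; each lead is represented as a list of samples.
LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]


def drop_leads(ecg, keep):
    """Return a copy of `ecg` in which every lead not named in `keep` is zero-filled.

    Zero-filling is an assumed masking scheme, chosen only for illustration.
    """
    keep = set(keep)
    return [
        list(lead) if name in keep else [0.0] * len(lead)
        for name, lead in zip(LEAD_NAMES, ecg)
    ]


def add_noise(ecg, noise_std, rng):
    """Return a copy of `ecg` with zero-mean Gaussian noise added to every sample.

    Gaussian noise is an assumed stand-in for real ECG artifacts.
    """
    return [[x + rng.gauss(0.0, noise_std) for x in lead] for lead in ecg]


rng = random.Random(0)
# Synthetic stand-in for a clean recording: 12 leads, 8 samples each.
clean = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in LEAD_NAMES]

lead_missing = drop_leads(clean, keep=["I", "II", "V2"])  # arbitrary lead subset
noisy = add_noise(clean, noise_std=0.05, rng=rng)
```

Both helpers preserve the 12-by-N shape of the input, so under these assumptions a single encoder could consume clean, lead-missing, and noisy recordings without any change to its input layer.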
