ProtoECGNet: Case-Based Interpretable Deep Learning for Multi-Label ECG Classification with Contrastive Learning
Sahil Sethi, David Chen, Thomas Statchen, Michael C. Burkhart, Nipun Bhandari, Bashar Ramadan, Brett Beaulieu-Jones
TL;DR
ProtoECGNet introduces a self-explaining, prototype-based framework for multi-label ECG classification by deploying three specialized prototype branches that mirror clinical reasoning: rhythm (1D global prototypes), morphology (2D time-localized prototypes), and global abnormalities (2D global prototypes). A novel contrastive prototype loss, along with clustering, separation, and orthogonality terms, shapes the latent prototype space to reflect realistic label co-occurrence while maintaining discriminability. On PTB-XL’s 71-label benchmark, ProtoECGNet achieves competitive macro- and weighted-AUROC scores and provides faithful, case-based explanations validated by clinician ratings of prototype representativeness and clarity. The work demonstrates that prototype learning can scale to complex time-series, multi-label medical tasks and offers a practical path toward trustworthy AI-assisted clinical decision support through grounded, interpretable reasoning.
Abstract
Deep learning-based electrocardiogram (ECG) classification has shown impressive performance, but clinical adoption has been slowed by the lack of transparent and faithful explanations. Post hoc methods such as saliency maps may fail to reflect a model's true decision process. Prototype-based reasoning offers a more transparent alternative by grounding decisions in similarity to learned representations of real ECG segments, enabling faithful, case-based explanations. We introduce ProtoECGNet, a prototype-based deep learning model for interpretable, multi-label ECG classification. ProtoECGNet employs a structured, multi-branch architecture that reflects clinical interpretation workflows: it integrates a 1D CNN with global prototypes for rhythm classification, a 2D CNN with time-localized prototypes for morphology-based reasoning, and a 2D CNN with global prototypes for diffuse abnormalities. Each branch is trained with a prototype loss designed for multi-label learning, combining clustering, separation, diversity, and a novel contrastive loss that encourages appropriate separation between prototypes of unrelated classes while allowing clustering for frequently co-occurring diagnoses. We evaluate ProtoECGNet on all 71 diagnostic labels from the PTB-XL dataset, demonstrating competitive performance relative to state-of-the-art black-box models while providing structured, case-based explanations. To assess prototype quality, we conduct a structured clinician review of the final model's projected prototypes, finding that they are rated as representative and clear. ProtoECGNet shows that prototype learning can be effectively scaled to complex, multi-label time-series classification, offering a practical path toward transparent and trustworthy deep learning models for clinical decision support.
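The idea behind the contrastive prototype loss, pulling together prototypes of frequently co-occurring diagnoses while pushing apart prototypes of unrelated ones, can be illustrated with a small sketch. This is not the paper's formulation. The Jaccard co-occurrence measure, the cosine similarity, the `threshold` parameter, and all function names below are assumptions chosen only to make the mechanism concrete.

```python
import math

def cooccurrence(labels):
    """Jaccard co-occurrence between classes, from multi-hot label rows.
    (Assumed measure; the paper may define co-occurrence differently.)"""
    n_classes = len(labels[0])
    cooc = [[0.0] * n_classes for _ in range(n_classes)]
    for i in range(n_classes):
        for j in range(n_classes):
            both = sum(1 for row in labels if row[i] and row[j])
            either = sum(1 for row in labels if row[i] or row[j])
            cooc[i][j] = both / either if either else 0.0
    return cooc

def cosine(u, v):
    """Cosine similarity between two prototype vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def contrastive_prototype_loss(prototypes, proto_class, cooc, threshold=0.1):
    """Hypothetical loss: mean similarity of unrelated-class prototype pairs
    minus mean similarity of co-occurring-class pairs. Lower is better."""
    pos, neg = [], []
    for p in range(len(prototypes)):
        for q in range(p + 1, len(prototypes)):
            ci, cj = proto_class[p], proto_class[q]
            if ci == cj:
                continue  # same-class pairs are left to the other loss terms
            sim = cosine(prototypes[p], prototypes[q])
            # Pairs whose classes co-occur often are treated as positives.
            (pos if cooc[ci][cj] > threshold else neg).append(sim)

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(neg) - mean(pos)

# Toy data: classes 0 and 1 co-occur; class 2 never co-occurs with either.
labels = [[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
cooc = cooccurrence(labels)
proto_class = [0, 1, 2]

# Layout A: co-occurring classes share a direction, class 2 is orthogonal.
good_loss = contrastive_prototype_loss([[1, 0], [1, 0], [0, 1]], proto_class, cooc)
# Layout B: unrelated classes 0 and 2 share a direction instead.
bad_loss = contrastive_prototype_loss([[1, 0], [0, 1], [1, 0]], proto_class, cooc)
print(good_loss, bad_loss)
```

In this toy setting, the layout that aligns prototypes of co-occurring classes receives the lower loss. In a real training loop such a term would be computed on learned prototype tensors and combined with the clustering, separation, and diversity terms named in the abstract.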
