ProtoEFNet: Dynamic Prototype Learning for Inherently Interpretable Ejection Fraction Estimation in Echocardiography

Yeganeh Ghamary, Victoria Wu, Hooman Vaseli, Christina Luong, Teresa Tsang, Siavash Bigdeli, Purang Abolmaesumi

TL;DR

EF estimation from echocardiography suffers from inter-observer variability, and black-box AI models for the task are hard for clinicians to trust. ProtoEFNet is an inherently interpretable, video-based prototype learning framework for continuous EF regression. It uses dynamic spatiotemporal prototypes and a Prototype Angular Separation (PAS) loss to enforce ordinal structure in the latent space. It achieves competitive accuracy on EchoNet-Dynamic while providing clinically meaningful explanations via prototype contributions and LV-focused activation maps, outperforming many explainable baselines. The ablation study indicates that every loss term, especially PAS, contributes to both prediction quality and an interpretable, ordinal organization of the prototypes, suggesting strong potential for clinical adoption with transparent reasoning.

Abstract

Ejection fraction (EF) is a crucial metric for assessing cardiac function and diagnosing conditions such as heart failure. Traditionally, EF estimation requires manual tracing and domain expertise, making the process time-consuming and subject to inter-observer variability. Most current deep learning methods for EF prediction are black-box models with limited transparency, which reduces clinical trust. Some post-hoc explainability methods have been proposed to interpret the decision-making process after the prediction is made. However, these explanations do not guide the model's internal reasoning and therefore offer limited reliability in clinical applications. To address this, we introduce ProtoEFNet, a novel video-based prototype learning model for continuous EF regression. The model learns dynamic spatiotemporal prototypes that capture clinically meaningful cardiac motion patterns. Additionally, the proposed Prototype Angular Separation (PAS) loss enforces discriminative representations across the continuous EF spectrum. Our experiments on the EchoNet-Dynamic dataset show that ProtoEFNet can achieve accuracy on par with its non-interpretable counterpart while providing clinically relevant insight. The ablation study shows that the proposed loss improves the F1 score by about 2 percentage points, from 77.67$\pm$2.68 to 79.64$\pm$2.10. Our source code is available at: https://github.com/DeepRCL/ProtoEF
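
The abstract does not give the formula of the PAS loss, so the following is only a rough sketch of the general idea of pushing prototypes apart in angle when their EF labels differ. The function names, the EF-gap weighting, and the `ef_scale` parameter are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def angular_separation_penalty(prototypes, ef_labels, ef_scale=100.0):
    """Hypothetical angular-separation penalty (NOT the paper's PAS formula).

    Each prototype pair is penalised by its positive cosine similarity,
    weighted by the normalised gap between the two EF labels. Pairs with
    very different EF values are therefore pushed apart the most.
    """
    total, weight_sum = 0.0, 0.0
    n = len(prototypes)
    for i in range(n):
        for j in range(i + 1, n):
            weight = abs(ef_labels[i] - ef_labels[j]) / ef_scale
            similarity = max(0.0, cosine(prototypes[i], prototypes[j]))
            total += weight * similarity
            weight_sum += weight
    # If all prototypes share one EF label there is nothing to separate.
    return total / weight_sum if weight_sum > 0 else 0.0

# Two prototypes with different EF labels: aligned vectors are penalised,
# orthogonal vectors are not.
aligned = angular_separation_penalty([[1.0, 0.0], [1.0, 0.0]], [30.0, 60.0])
orthogonal = angular_separation_penalty([[1.0, 0.0], [0.0, 1.0]], [30.0, 60.0])
```

Under these assumptions the penalty is largest when prototypes with distant EF labels point in the same direction, which is one simple way to encourage an ordinal arrangement in the latent space.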

Paper Structure

This paper contains 16 sections, 7 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: (a) An overview of the architecture of ProtoEFNet. The feature extractor uses an ROI module to focus on clinically relevant spatio-temporal regions, and the final prediction is the weighted sum of the prototype labels. (b) Grad-CAM attention of CoReEcho (Maani et al., 2024) and the decision process of ProtoEFNet. The activation maps on the input data and prototypes show where the model "looks" when calculating the cosine similarity. (c) The Prototype Angular Separation (PAS) loss increases the separation between prototypes with different EF ranges, and each prototype is projected onto (replaced with) the closest training sample within its EF range.
  • Figure 2: Grad-CAM (Selvaraju et al., 2017) of CoReEcho (bottom row) and the activation map of ProtoEFNet (top row). The ProtoEFNet activation is localised on LV wall motion and mitral valve movement during systole (contraction).
  • Figure 3: Activation maps of a test case and the top contributing prototype. The model captures periodic patterns and aligns spatio-temporal features. In this example, it assigns a high similarity score as it "looks at" the LV wall during systole in the input and identifies that it "looks like" the LV wall of the prototype during the same phase.
  • Figure 4: Ablation study of the different loss components on the validation set. Standard deviation is calculated across 5 repetitions of each experiment.
  • Figure 5: The PCA plots of the prototypes (circles) and the top 100 closest latent features of the validation set to each prototype (stars). The colors indicate the ground truth EF values.
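
The caption of Figure 1(a) describes the final prediction as a weighted sum of the prototype labels, with cosine similarity used to compare the input to each prototype. A minimal sketch of that decision step is given below. How the similarities are turned into weights is not stated in this summary, so the softmax normalisation, the `temperature` parameter, and the function name `predict_ef` are assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def predict_ef(feature, prototypes, prototype_efs, temperature=0.1):
    """Hypothetical prototype-based regression head.

    Returns the predicted EF and the per-prototype weights, so each
    prototype's contribution to the prediction can be inspected.
    """
    sims = [cosine(feature, p) for p in prototypes]
    # Assumed weighting: numerically stable softmax over the similarities.
    top = max(sims)
    exps = [math.exp((s - top) / temperature) for s in sims]
    normaliser = sum(exps)
    weights = [e / normaliser for e in exps]
    # Prediction = weighted sum of the EF labels attached to the prototypes.
    ef = sum(w * y for w, y in zip(weights, prototype_efs))
    return ef, weights

prototypes = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-D prototype vectors
prototype_efs = [30.0, 60.0]            # EF label attached to each prototype

# A feature aligned with the first prototype yields an EF close to 30.
ef_near, w_near = predict_ef([1.0, 0.0], prototypes, prototype_efs)
# A feature equally similar to both prototypes yields the midpoint.
ef_mid, w_mid = predict_ef([1.0, 1.0], prototypes, prototype_efs)
```

Because the output is a convex combination of prototype labels, the weights double as an explanation: the prototype with the largest weight is the one the input "looks like" most.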