Evaluating Feature Attribution Methods for Electrocardiogram

Jangwon Suh, Jimyeong Kim, Euna Jung, Wonjong Rhee

TL;DR

This work identifies and customizes three evaluation metrics for feature attribution methods based on the characteristics of ECG: localization score, pointing game, and degradation score. It finds that some feature attribution methods are considerably better suited to explaining ECG, with Grad-CAM outperforming the second-best method by a large margin.

Abstract

The performance of cardiac arrhythmia detection with electrocardiograms (ECGs) has improved considerably since the introduction of deep learning models. In practice, high performance alone is not sufficient; a proper explanation is also required. Researchers have recently started adopting feature attribution methods to address this requirement, but it has been unclear which of the methods are appropriate for ECG. In this work, we identify and customize three evaluation metrics for feature attribution methods based on the characteristics of ECG: localization score, pointing game, and degradation score. Using the three evaluation metrics, we evaluate and analyze eleven widely used feature attribution methods. We find that some of the feature attribution methods are considerably better suited to explaining ECG, with Grad-CAM outperforming the second-best method by a large margin.

Paper Structure

This paper contains 13 sections, 2 equations, 3 figures, and 2 tables.

Figures (3)

  • Figure 1: Overview of our three proposed evaluation metrics. ECG signals (blue line) containing PVC beats are visualized with Grad-CAM attributions (green line). Because the localization score and the pointing game need information on where the abnormal beats are, we additionally mark the PVC beats in each example.
  • Figure 2: Illustration of how the label of an ECG example consisting of multiple beats is constructed. An example consisting entirely of normal beats is labeled as normal; an example including any PVC beat is labeled as PVC.
  • Figure 3: Visualization of ECG signals (blue line) with Grad-CAM, Integrated Gradients, and KernelSHAP attributions (green line). Grad-CAM clearly shows a strong relationship with the ground-truth beats of PAC and PVC.
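
The labeling rule in Figure 2 and the role of beat positions noted in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `"N"`/`"PVC"` beat codes, and the `(start, end)` sample-index spans are assumptions made for illustration, and `pointing_game_hit` follows the generic form of the pointing game (does the strongest attribution land inside an annotated abnormal beat?), which may differ from the customized version evaluated in the paper.

```python
from typing import Sequence, Tuple

import numpy as np


def example_label(beat_labels: Sequence[str], abnormal: str = "PVC") -> str:
    """Label a multi-beat example: abnormal if any beat is abnormal, else normal."""
    return abnormal if any(b == abnormal for b in beat_labels) else "normal"


def beat_mask(length: int, beat_spans: Sequence[Tuple[int, int]]) -> np.ndarray:
    """Boolean mask that is True inside each [start, end) abnormal-beat span.

    The span format is a hypothetical annotation layout, not one from the paper.
    """
    mask = np.zeros(length, dtype=bool)
    for start, end in beat_spans:
        mask[start:end] = True
    return mask


def pointing_game_hit(attribution: np.ndarray, mask: np.ndarray) -> bool:
    """Generic pointing game: is the maximum attribution inside an abnormal beat?"""
    return bool(mask[int(np.argmax(attribution))])


# Toy example: a 10-sample signal with one abnormal beat covering samples 3..5.
labels = ["N", "N", "PVC"]
mask = beat_mask(10, [(3, 6)])
attribution = np.array([0.0, 0.1, 0.0, 0.2, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0])

print(example_label(labels))
print(pointing_game_hit(attribution, mask))
```

Averaging `pointing_game_hit` over all abnormal examples would give a hit rate per attribution method, which is how pointing-game results are usually summarized.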