
Revisiting the Disequilibrium Issues in Tackling Heart Disease Classification Tasks

Thao Hoang, Linh Nguyen, Khoi Do, Duong Nguyen, Viet Dung Nguyen

TL;DR

The paper addresses two problems in ECG heart disease classification: channel-wise magnitude imbalances across leads, and long-tailed class distributions that bias deep learning (DL) training. It introduces the Channel-wise Magnitude Equalizer (CME), which normalizes channel powers while preserving frequency characteristics, and the Inverted Weight Logarithmic (IWL) loss, which compensates for imbalanced data without adding model complexity. On the CPSC2018 dataset, CME+IWL improves state-of-the-art models, with EfficientNetB0 plus CME+IWL achieving the top results (accuracies in the low- to mid-80s and elevated F1-scores across imbalance levels). The approach is architecture-agnostic and offers a practical way to improve ECG classification in realistic, imbalanced data scenarios without designing new architectures.

Abstract

Heart disease classification faces two primary obstacles. First, existing Electrocardiogram (ECG) datasets are consistently imbalanced and biased across modalities. Second, these time-series data comprise multiple lead signals of differing power, which causes Convolutional Neural Networks (CNNs) to overfit to the leads with higher power and degrades the performance of Deep Learning (DL) models. In addition, training on such high-dimensional data with an imbalanced dataset makes models susceptible to overfitting. Current efforts predominantly focus on designing novel DL architectures and seemingly overlook these core issues, which hinders progress in heart disease classification. To address these obstacles, we propose two straightforward methods. To address the high dimensionality issue, we apply a Channel-wise Magnitude Equalizer (CME) to signal-encoded images, which reduces redundancy in the feature data range and highlights changes in the data. To counteract data imbalance, we propose the Inverted Weight Logarithmic (IWL) loss. With IWL loss, the accuracy of state-of-the-art (SOTA) models increases by up to 5% on the CPSC2018 dataset. CME combined with IWL also surpasses the classification results of other baseline models by 5% to 10%.
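
The channel-wise equalization idea in the abstract can be illustrated with a minimal sketch that rescales every lead to a common RMS power. This is not the authors' CME implementation, which according to Figure 1 also involves an attention-like weighting and an interpolation step. The function name `equalize_channel_power`, the `target_rms` parameter, the `rms` helper, and the synthetic leads are all assumptions made for illustration.

```python
import math


def rms(channel):
    """Root-mean-square amplitude of one lead (list of float samples)."""
    return math.sqrt(sum(x * x for x in channel) / len(channel))


def equalize_channel_power(signal, target_rms=1.0, eps=1e-8):
    """Scale each lead so that its RMS amplitude is close to target_rms.

    signal: list of channels, each a non-empty list of float samples.
    Each channel is multiplied by a single scalar gain, so only its
    magnitude changes; the relative frequency content is untouched.
    """
    equalized = []
    for channel in signal:
        gain = target_rms / (rms(channel) + eps)  # eps guards flat leads
        equalized.append([x * gain for x in channel])
    return equalized


# Two hypothetical leads: same waveform, amplitudes differing by 20x.
lead_a = [math.sin(2 * math.pi * 5 * t / 500) for t in range(500)]
lead_b = [20.0 * x for x in lead_a]

balanced = equalize_channel_power([lead_a, lead_b])
print([round(rms(ch), 4) for ch in balanced])
```

Because the gain is one scalar per lead, a high-power lead can no longer dominate the input range simply by being larger, which is the failure mode the abstract attributes to CNNs trained on raw multi-lead signals.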

Paper Structure (13 sections, 8 equations, 2 figures, 2 tables)


Figures (2)

  • Figure 1: Methodology Overview: The input channel-wise ECG signal first goes through the CME process. Here, we normalize each channel-wise signal and assess its contribution to the loss update (illustrated by different tones of green) using an Attention-like layer. The $C$ channels are then squeezed and excited based on their relative importance, followed by an interpolation. The resulting images are then fed to the AI model. The original dataset has a long-tailed distribution. After each training epoch, the model outputs prediction vectors, which go into the IWL loss function together with the true labels $y^i$. The IWL loss balances the gradients and updates the AI model so that the contribution of tail (low-frequency) classes is strengthened and that of high-frequency classes is weakened.
  • Figure 2: Dataset illustration: a) The sample distribution of the 9 classes (6400 single-label samples in total): Right bundle branch block (RBBB), Atrial fibrillation (AF), Normal, ST-segment depression (STD), First-degree atrioventricular block (I-AVB), Premature ventricular contraction (PVC), Premature atrial contraction (PAC), ST-segment elevation (STE) and Left bundle branch block (LBBB). b) ECG waveforms for left bundle branch block (from A0011). Five leads are shown.
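
The rebalancing described for Figure 1 (strengthening tail classes, weakening frequent ones) can be sketched with a generic inverse-frequency weighted log loss. The IWL formula itself is not given in this section, so the code below is a stand-in technique and not the authors' loss: it uses the common "balanced" heuristic, weight = total / (num_classes * count). The function names and the class counts are hypothetical and are not the CPSC2018 figures.

```python
import math


def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count).

    Assumption: the common 'balanced' heuristic, used here only as a
    stand-in for the paper's IWL weighting.
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {
        label: total / (num_classes * count)
        for label, count in class_counts.items()
    }


def weighted_log_loss(probs, labels, weights, eps=1e-12):
    """Mean of -weights[y] * log(p[y]) over a batch.

    probs: list of dicts mapping class label -> predicted probability.
    labels: list of true class labels, one per sample.
    """
    losses = [
        -weights[y] * math.log(max(p[y], eps))
        for p, y in zip(probs, labels)
    ]
    return sum(losses) / len(losses)


# Hypothetical long-tailed counts (illustrative only).
counts = {"Normal": 900, "AF": 90, "LBBB": 10}
weights = inverse_frequency_weights(counts)

# The same predicted probability costs more on a rare class.
prediction = {"Normal": 0.5, "AF": 0.25, "LBBB": 0.5}
loss_common = weighted_log_loss([prediction], ["Normal"], weights)
loss_rare = weighted_log_loss([prediction], ["LBBB"], weights)
print(loss_rare > loss_common)
```

With these weights, an equally confident mistake on a rare class produces a larger loss, and hence a larger gradient, than the same mistake on a frequent class.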