CardioCaps: Attention-based Capsule Network for Class-Imbalanced Echocardiogram Classification

Hyunkyung Han, Jihyeon Seong, Jaesik Choi

TL;DR

The paper tackles class-imbalanced echocardiogram classification by extending DR-CapsNets with CardioCaps, an attention-based routing architecture. It introduces a weighted margin loss paired with an EF-regression auxiliary loss to address label imbalance and uses an attention mechanism instead of iterative dynamic routing for training efficiency. On EchoNet-LVH, CardioCaps outperforms machine-learning baselines (Logistic Regression, Random Forest, XGBoost) and deep-learning baselines (CNNs, ResNet, U-Net, ViT) as well as advanced CapsNets (EM-CapsNets, Efficient-CapsNets), achieving high accuracy and robust precision in imbalanced settings. An ablation study confirms the importance of the shared affine matrix for translation equivariance and demonstrates the practical impact of CardioCaps for multi-angle echocardiogram analysis in clinical settings.

Abstract

Capsule Neural Networks (CapsNets) are a novel architecture that utilizes vector-wise representations formed by multiple neurons. Specifically, Dynamic Routing CapsNets (DR-CapsNets) employ an affine matrix and a dynamic routing mechanism to train capsules and acquire translation-equivariance properties, enhancing their robustness compared to traditional Convolutional Neural Networks (CNNs). Echocardiograms, which capture moving images of the heart, present unique challenges for traditional image classification methods. In this paper, we explore the potential of DR-CapsNets and propose CardioCaps, a novel attention-based DR-CapsNet architecture for class-imbalanced echocardiogram classification. CardioCaps comprises two key components: a weighted margin loss incorporating a regression auxiliary loss, and an attention mechanism. First, the weighted margin loss prioritizes positive cases and is supplemented by an auxiliary loss function based on the Ejection Fraction (EF) regression task, where EF is a crucial measure of cardiac function. This approach enhances the model's resilience in the face of class imbalance. Second, recognizing that the quadratic complexity of dynamic routing leads to training inefficiencies, we adopt the attention mechanism as a more computationally efficient alternative. Our results demonstrate that CardioCaps surpasses traditional machine learning baseline methods, including Logistic Regression, Random Forest, and XGBoost with sampling methods and a class weight matrix. Furthermore, CardioCaps outperforms other deep learning baseline methods such as CNNs, ResNets, U-Nets, and ViTs, as well as advanced CapsNets methods such as EM-CapsNets and Efficient-CapsNets. Notably, our model demonstrates robustness to class imbalance, achieving high precision even in datasets with a substantial proportion of negative cases.
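
To make the first component more concrete, here is a minimal sketch of a class-weighted margin loss combined with an EF-regression auxiliary term. The margin form (m+ = 0.9, m- = 0.1, lambda = 0.5) is the standard DR-CapsNet margin loss; how the weight enters, the squared-error form of the EF term, and the names `class_weights` and `alpha` are assumptions made for illustration, not the formulation given in the paper.

```python
def weighted_margin_loss(lengths, target, class_weights,
                         m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss over class-capsule lengths for a single sample.

    lengths:       capsule norms ||v_k||, one per class
    target:        index of the true class
    class_weights: per-class weight applied to the "class present" term
                   (assumed weighting scheme; a larger weight on the
                   positive class makes missed positives cost more)
    """
    total = 0.0
    for k, norm in enumerate(lengths):
        if k == target:
            # Penalise a true-class capsule that is shorter than m_pos.
            total += class_weights[k] * max(0.0, m_pos - norm) ** 2
        else:
            # Penalise a wrong-class capsule that is longer than m_neg.
            total += lam * max(0.0, norm - m_neg) ** 2
    return total


def total_loss(lengths, target, ef_pred, ef_true, class_weights, alpha=0.1):
    """Weighted margin loss plus an assumed squared-error EF auxiliary term."""
    aux = (ef_pred - ef_true) ** 2
    return weighted_margin_loss(lengths, target, class_weights) + alpha * aux


# Example: a positive sample (class 1) whose positive capsule is too short.
loss = total_loss(lengths=[0.2, 0.5], target=1,
                  ef_pred=0.55, ef_true=0.60,
                  class_weights=[1.0, 3.0])
```

In this sketch, raising `class_weights[1]` increases the penalty only for positive samples whose capsule is too short, which is one simple way to prioritize positive cases under class imbalance.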

Paper Structure (27 sections, 6 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: Samples of echocardiogram datasets. As shown in the figure, the heart is in motion, and its shape deviates slightly from the one confirmed by the doctor. Our study demonstrates that the DR-CapsNets, acting as a translation-equivariance learning model, can achieve optimal performance in diagnosing echocardiograms.
  • Figure 2: CardioCaps: attention-based DR-CapsNets. The diagram illustrates the architecture of the proposed attention-based DR-CapsNets, named CardioCaps, designed for class-imbalanced echocardiogram classification. The network comprises five components: ReLU Conv, Primary Capsules, Affine Transformation, Attention, and FC Layer Decoder. First, the ReLU Conv extracts features using a large kernel size. Second, the Primary Capsule layer transforms feature neurons into capsules with a $d_{in cap}$ dimension. Third, the Affine Transformation matrix is applied to the capsules to ensure robust transformations. Fourth, we utilize the attention mechanism instead of dynamic routing for enhanced training efficiency. Finally, the normalized digit capsules are fed into the FC decoder to obtain reconstruction outputs. Note that $d$ represents the dimension of capsules, and $C$ denotes the number of unique classes.
  • Figure 3: Empirical study on loss function.
  • Figure 4: Empirical study on dynamic routing and attention.
  • Figure 5: Empirical study on affine matrix.
  • ...and 1 more figure
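
The Affine Transformation and Attention stages named in the Figure 2 caption can be sketched as a single-pass replacement for iterative dynamic routing. This is an illustrative sketch only: the scaled dot-product agreement against the mean vote, the use of the squash function for normalization, and all function names are assumptions, not the attention formulation used by CardioCaps.

```python
import math


def matvec(W, u):
    """Multiply matrix W (list of rows) by vector u."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def squash(v):
    """Capsule squash: keeps direction, maps the norm into [0, 1)."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def attention_routing(primary_caps, affine):
    """Route N primary capsules to C class capsules in one pass.

    primary_caps: N vectors of dimension d_in
    affine:       C matrices of shape (d_out, d_in); each matrix is shared
                  by all N input capsules (the shared affine matrix)
    """
    class_caps = []
    for W in affine:
        votes = [matvec(W, u) for u in primary_caps]          # N x d_out
        d_out = len(votes[0])
        mean_vote = [sum(v[k] for v in votes) / len(votes)
                     for k in range(d_out)]
        # Assumed agreement score: scaled dot product with the mean vote.
        scores = [sum(a * b for a, b in zip(v, mean_vote)) / math.sqrt(d_out)
                  for v in votes]
        weights = softmax(scores)                             # sums to 1 over N
        pooled = [sum(w * v[k] for w, v in zip(weights, votes))
                  for k in range(d_out)]
        class_caps.append(squash(pooled))
    return class_caps


# Example: 3 primary capsules (d_in = 2) routed to C = 2 class capsules.
caps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
affine = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 2.0]]]
out = attention_routing(caps, affine)
```

Unlike dynamic routing, which refines coupling coefficients over several iterations, this sketch computes the weights once per class, which is where an efficiency gain would come from.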