CardioCaps: Attention-based Capsule Network for Class-Imbalanced Echocardiogram Classification
Hyunkyung Han, Jihyeon Seong, Jaesik Choi
TL;DR
The paper tackles class-imbalanced echocardiogram classification with CardioCaps, an attention-based extension of DR-CapsNets. It pairs a weighted margin loss with an EF-regression auxiliary loss to address label imbalance, and replaces iterative dynamic routing with an attention mechanism to improve training efficiency. On EchoNet-LVH, CardioCaps outperforms machine-learning baselines (Logistic Regression, Random Forest, XGBoost), deep-learning baselines (CNNs, ResNet, U-Net, ViT), and advanced CapsNets (EM-CapsNets, Efficient-CapsNets), achieving high accuracy and robust precision in imbalanced settings. An ablation study confirms the importance of the shared affine matrix for translation equivariance and demonstrates the practical impact of CardioCaps for multi-angle echocardiogram analysis in clinical settings.
Abstract
Capsule Neural Networks (CapsNets) are a novel architecture that utilizes vector-wise representations formed by multiple neurons. Specifically, Dynamic Routing CapsNets (DR-CapsNets) employ an affine matrix and a dynamic routing mechanism to train capsules and acquire translation-equivariance properties, making them more robust than traditional Convolutional Neural Networks (CNNs). Echocardiograms, which capture moving images of the heart, present unique challenges for traditional image classification methods. In this paper, we explore the potential of DR-CapsNets and propose CardioCaps, a novel attention-based DR-CapsNet architecture for class-imbalanced echocardiogram classification. CardioCaps comprises two key components: a weighted margin loss incorporating a regression auxiliary loss, and an attention mechanism. First, the weighted margin loss prioritizes positive cases and is supplemented by an auxiliary loss based on the Ejection Fraction (EF) regression task, where EF is a crucial measure of cardiac function. This combination makes the model more resilient to class imbalance. Second, because the quadratic complexity of dynamic routing makes training inefficient, we adopt an attention mechanism as a more computationally efficient alternative. Our results demonstrate that CardioCaps surpasses traditional machine learning baselines, including Logistic Regression, Random Forest, and XGBoost, combined with sampling methods and a class weight matrix. Furthermore, CardioCaps outperforms deep learning baselines such as CNNs, ResNets, U-Nets, and ViTs, as well as advanced CapsNets such as EM-CapsNets and Efficient-CapsNets. Notably, our model is robust to class imbalance, achieving high precision even on datasets with a substantial proportion of negative cases.
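To make the first component more concrete, the following is a minimal sketch of how a weighted margin loss could be combined with an EF-regression auxiliary loss. It is an illustration under stated assumptions, not the formulation used in the paper: the per-class term follows the standard DR-CapsNets margin loss (margins 0.9 and 0.1, down-weighting factor 0.5), while the function names, the positive-class index, `pos_weight`, `aux_weight`, the choice of mean squared error, and the scaling of EF to [0, 1] are all hypothetical.

```python
import numpy as np

POSITIVE_CLASS = 1  # assumption: column 1 of the one-hot target is the positive class


def weighted_margin_loss(lengths, targets, pos_weight=2.0,
                         m_pos=0.9, m_neg=0.1, lam=0.5):
    """Margin loss with extra weight on positive samples (hypothetical weighting).

    lengths: (batch, classes) norms of the output capsules, each in [0, 1].
    targets: (batch, classes) one-hot labels.
    """
    lengths = np.asarray(lengths, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Standard DR-CapsNets margin terms: penalize a short capsule for the true
    # class and a long capsule for every other class.
    present = targets * np.maximum(0.0, m_pos - lengths) ** 2
    absent = lam * (1.0 - targets) * np.maximum(0.0, lengths - m_neg) ** 2
    per_sample = (present + absent).sum(axis=1)
    # Assumed weighting scheme: scale the loss of positive samples by pos_weight.
    sample_w = np.where(targets[:, POSITIVE_CLASS] == 1.0, pos_weight, 1.0)
    return float((sample_w * per_sample).mean())


def ef_auxiliary_loss(ef_pred, ef_true):
    """Mean squared error on EF values (assumed here to be scaled to [0, 1])."""
    ef_pred = np.asarray(ef_pred, dtype=float)
    ef_true = np.asarray(ef_true, dtype=float)
    return float(np.mean((ef_pred - ef_true) ** 2))


def total_loss(lengths, targets, ef_pred, ef_true, aux_weight=0.1):
    """Classification loss plus a weighted auxiliary regression loss."""
    return (weighted_margin_loss(lengths, targets)
            + aux_weight * ef_auxiliary_loss(ef_pred, ef_true))
```

With these assumed defaults, a misclassified positive sample contributes twice the loss of an equally misclassified negative sample, which is one simple way to prioritize the minority class during training.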
