A Multi-Resolution Mutual Learning Network for Multi-Label ECG Classification

Wei Huang, Ning Wang, Panpan Feng, Haiyan Wang, Zongmin Wang, Bing Zhou

TL;DR

The paper tackles multi-label ECG classification by capturing both local waveform details and global rhythm patterns. It introduces MRM-Net, a network that combines a dual-resolution attention mechanism with a feature complementary (mutual learning) module to fuse multi-scale information without losing important features. Empirical results on PTB-XL and CPSC2018 demonstrate that MRM-Net outperforms state-of-the-art methods across multiple tasks, with notable gains in rhythm and morphology classification. The approach offers a robust and practical framework for clinical ECG analysis through principled cross-resolution feature fusion and knowledge sharing.

Abstract

Electrocardiograms (ECG), which record the electrophysiological activity of the heart, have become a crucial tool for diagnosing cardiovascular diseases. In recent years, the application of deep learning techniques has significantly improved the performance of ECG signal classification. Multi-resolution feature analysis, which captures and processes information at different time scales, can extract subtle changes and overall trends in ECG signals, showing unique advantages. However, common multi-resolution analysis methods based on simple feature addition or concatenation may lead to the neglect of low-resolution features, affecting model performance. To address this issue, this paper proposes the Multi-Resolution Mutual Learning Network (MRM-Net). MRM-Net includes a dual-resolution attention architecture and a feature complementary mechanism. The dual-resolution attention architecture processes high-resolution and low-resolution features in parallel. Through the attention mechanism, the high-resolution and low-resolution branches can focus on subtle waveform changes and overall rhythm patterns, enhancing the ability to capture critical features in ECG signals. Meanwhile, the feature complementary mechanism introduces mutual feature learning after each layer of the feature extractor. This allows features at different resolutions to reinforce each other, thereby reducing information loss and improving model performance and robustness. Experiments on the PTB-XL and CPSC2018 datasets demonstrate that MRM-Net significantly outperforms existing methods in multi-label ECG classification performance. The code for our framework will be publicly available at https://github.com/wxhdf/MRM.

Paper Structure

This paper contains 16 sections, 8 equations, 6 figures, and 7 tables.

Figures (6)

  • Figure 1: The overall architecture of MRM-Net comprises three main modules: the Multi-Scale Convolution Module (MS Module), the Dual-Resolution Attention Module, and the Feature Complementary Module. The Dual-Resolution Attention Module includes an Attention Fusion Layer (AF Layer) and a Fully Connected Classification Layer (FC Layer). The Feature Complementary Module contains three KLoss layers. Here, y represents the true labels, while Pre1 and Pre2 are the predictions from the low-resolution and high-resolution branches, respectively.
  • Figure 2: Schematic diagram of the Multi-Scale Convolution Module. ConvBlock consists of a convolution layer, a BatchNorm1d layer, a ReLU activation function, and a MaxPool1d layer. conv denotes the convolution layer, and $\oplus$ denotes element-wise addition.
  • Figure 3: (a) Illustration of the Attention Fusion Layer, which includes three BaseBlocks, an Attention Fusion layer, and a LayerNorm layer. (b) Detailed illustration of the Attention Fusion. In this diagram, BaseBlock represents a residual block, conv denotes the convolution layer, AvgPool and MaxPool represent global average pooling and global max pooling, respectively, $\oplus$ denotes element-wise addition, $\otimes$ denotes element-wise multiplication, and $\textcircled{c}$ denotes the concatenation operation.
  • Figure 4: Diagram illustrates the feature complementary module, where Conv denotes the convolution operation, Flattening represents the flattening operation, and KL is the KL-divergence, enabling mutual learning of features at different resolutions through imitation loss.
  • Figure 5: The confusion matrix of our method, with the horizontal axis representing the true classes and the vertical axis representing the classes predicted by our method. The values are normalized by the total number of true labels.
  • ...and 1 more figure
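
The ConvBlock in Figure 2 chains a convolution layer, BatchNorm1d, ReLU, and MaxPool1d. A minimal pure-Python sketch of that pipeline is below; the kernel values, pooling width, and inference-style normalization are illustrative assumptions, not the paper's actual hyperparameters.

```python
import math

def conv1d(x, kernel, stride=1):
    """Valid 1-D convolution (cross-correlation) of sequence x with kernel."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]

def batch_norm(x, eps=1e-5):
    """Normalize to zero mean / unit variance (no learned affine parameters)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def relu(x):
    return [max(0.0, v) for v in x]

def max_pool(x, width=2):
    """Non-overlapping 1-D max pooling."""
    return [max(x[i:i + width]) for i in range(0, len(x) - width + 1, width)]

def conv_block(x, kernel):
    """ConvBlock from Figure 2: conv -> BatchNorm -> ReLU -> MaxPool."""
    return max_pool(relu(batch_norm(conv1d(x, kernel))))
```

In the MS Module, several such blocks with different kernel sizes run in parallel over the same signal, and their (length-aligned) outputs are combined by element-wise addition to mix fine and coarse temporal views.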
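
The Attention Fusion in Figure 3(b) pools each branch with global average and global max pooling, derives a gate from the pooled descriptors, rescales the features by element-wise multiplication, and concatenates the results. The sketch below compresses this to scalar gates with assumed fixed weights (`w_avg`, `w_max` are hypothetical stand-ins for the learned convolutional combination in the figure).

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def attention_fuse(a, b, w_avg=1.0, w_max=1.0):
    """Simplified Attention Fusion: pooled descriptors -> gate ->
    element-wise rescaling -> concatenation of the two branches."""
    def gate(x):
        avg = sum(x) / len(x)                      # global average pooling
        mx = max(x)                                # global max pooling
        return sigmoid(w_avg * avg + w_max * mx)   # assumed learned mixing
    scaled_a = [gate(a) * v for v in a]            # element-wise multiplication
    scaled_b = [gate(b) * v for v in b]
    return scaled_a + scaled_b                     # concatenation
```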
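
The Feature Complementary Module in Figure 4 has each resolution branch imitate the other via a KL-divergence loss on convolved, flattened features. A minimal sketch, assuming the imitation loss is a symmetric KL divergence between softmax-normalized flattened features (the paper's exact formulation may differ):

```python
import math

def softmax(z):
    """Numerically stable softmax over a flattened feature vector."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_imitation_loss(feat_low, feat_high):
    """Symmetric imitation loss: each branch mimics the other's
    softmax-normalized flattened features via KL divergence."""
    p = softmax(feat_low)
    q = softmax(feat_high)
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Applying this loss after each feature-extractor layer lets the low- and high-resolution branches reinforce each other during training instead of being fused only once at the end.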