GAF-FusionNet: Multimodal ECG Analysis via Gramian Angular Fields and Split Attention

Jiahao Qin, Feng Liu

TL;DR

GAF-FusionNet tackles the challenge of accurate ECG classification by fusing time-series data with image-based representations obtained via Gramian Angular Fields. It introduces a dual-branch architecture (time-series CNN+BiLSTM and GAF image CNN) and a novel dual-layer cross-channel split attention module to adaptively weight intra- and inter-modality information, followed by MLP-based fusion and softmax classification. The method achieves state-of-the-art performance on ECG200, ECG5000, and MIT-BIH datasets and is validated through comprehensive ablations demonstrating the value of cross-modal fusion and the attention mechanism. While promising, the work notes computational demands and the need for broader clinical validation and interpretability, outlining future directions in those areas.

Abstract

Electrocardiogram (ECG) analysis plays a crucial role in diagnosing cardiovascular diseases, but accurate interpretation of these complex signals remains challenging. This paper introduces GAF-FusionNet, a novel multimodal framework for ECG classification that integrates time-series analysis with an image-based representation built from Gramian Angular Fields (GAF). Our approach employs a dual-layer cross-channel split attention module to adaptively fuse temporal and spatial features, enabling nuanced integration of complementary information. We evaluate GAF-FusionNet on three diverse ECG datasets: ECG200, ECG5000, and the MIT-BIH Arrhythmia Database. Results demonstrate significant improvements over state-of-the-art methods, with our model achieving 94.5%, 96.9%, and 99.6% accuracy on the respective datasets. Our code will soon be available at https://github.com/Cross-Innovation-Lab/GAF-FusionNet.git.
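
To make the image-based representation concrete, the following is a minimal sketch of the standard Gramian Angular Field transform, not the authors' implementation. It rescales a 1-D signal to [-1, 1], maps each sample to an angle via arccos, and builds a matrix from pairwise angle sums (or differences). The function name `gramian_angular_field`, its `method` parameter, and the default to the summation variant are assumptions made for illustration; the abstract does not state which variant or image size the paper uses.

```python
import numpy as np


def gramian_angular_field(series, method="summation"):
    """Encode a 1-D series as a Gramian Angular Field image (illustrative sketch)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")

    # Min-max rescale to [-1, 1] so that arccos is defined for every sample.
    x = 2.0 * (x - lo) / (hi - lo) - 1.0
    x = np.clip(x, -1.0, 1.0)  # guard against floating-point overshoot

    # Polar encoding: each sample becomes an angle in [0, pi].
    phi = np.arccos(x)

    if method == "summation":
        # GASF: entry (i, j) is cos(phi_i + phi_j).
        return np.cos(phi[:, None] + phi[None, :])
    if method == "difference":
        # GADF: entry (i, j) is sin(phi_i - phi_j).
        return np.sin(phi[:, None] - phi[None, :])
    raise ValueError("method must be 'summation' or 'difference'")


# Usage: a length-n signal yields an n x n image for the image branch.
image = gramian_angular_field([0.0, 1.0, 2.0, 3.0])
print(image.shape)
```

In a dual-branch setup such as the one described in the TL;DR, the raw series would feed the time-series branch while a matrix like `image` would feed the image CNN. Any resizing or channel stacking before the CNN is left out here because the text above does not specify it.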

Paper Structure (25 sections, 13 equations, 1 figure, 3 tables)