Automated Medical Report Generation for ECG Data: Bridging Medical Text and Signal Processing with Deep Learning

Amnon Bleich, Antje Linnemann, Bjoern H. Diem, Tim OF Conrad

TL;DR

This paper addresses the lack of scalable methods for automated text generation from ECG signals by adapting image-captioning encoder–decoder architectures to ECG data. It couples a 1D ResNet34 encoder with a Transformer or LSTM decoder and trains on free-text clinical reports from the PTB-XL and ICM datasets, after data preprocessing, translation, and abbreviation unification. On the official PTB-XL splits, the best configuration reaches a METEOR score of 55.53%, compared with 24.51% for the reference model by Qiu et al. Translation and careful preprocessing provide significant gains, whereas pre-training the encoder on rhythm labels offers limited benefit. The study provides a reproducible benchmark and demonstrates robustness across diverse ECG formats, laying the groundwork for extension to larger datasets, to other 1D medical signals such as EEG, and to potential clinical decision support applications.

Abstract

Recent advances in deep learning and natural language generation have significantly improved image captioning, enabling automated, human-like descriptions of visual content. In this work, we apply these captioning techniques to generate clinician-like interpretations of ECG data. This study leverages existing ECG datasets accompanied by free-text reports authored by healthcare professionals (HCPs) as training data. These reports, while often inconsistent, provide a valuable foundation for automated learning. We introduce an encoder-decoder method that uses these reports to train models to generate detailed descriptions of ECG episodes. This represents a significant advancement in ECG analysis automation, with potential applications in zero-shot classification and automated clinical decision support. The model is tested on various datasets, including both 1- and 12-lead ECGs. It significantly outperforms the state-of-the-art reference model by Qiu et al., achieving a METEOR score of 55.53% compared with 24.51% for the reference model. Furthermore, several key design choices are discussed, providing a comprehensive overview of current challenges and innovations in this domain. The source code for this research is publicly available in our Git repository: https://git.zib.de/ableich/ecg-comment-generation-public
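Since the headline comparison rests on METEOR, a minimal sketch of how such a score is formed may help. It is an illustration under stated assumptions, not the evaluation code used in the paper: it matches exact lowercase tokens only (no stemming or synonym matching), aligns tokens greedily from left to right, and uses the parameters of the original METEOR formulation (recall-weighted F-mean and a fragmentation penalty of 0.5 * (chunks / matches)^3). The function name `simple_meteor` is hypothetical.

```python
def simple_meteor(reference: str, hypothesis: str) -> float:
    """Simplified METEOR: exact unigram matches only (an assumption;
    full METEOR also matches stems and synonyms)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # Greedy alignment (assumption): each hypothesis token is mapped to
    # the first still-unused identical token in the reference.
    used = [False] * len(ref)
    alignment = []  # list of (hypothesis_index, reference_index)
    for i, tok in enumerate(hyp):
        for j, rtok in enumerate(ref):
            if not used[j] and rtok == tok:
                used[j] = True
                alignment.append((i, j))
                break

    m = len(alignment)
    if m == 0:
        return 0.0

    precision = m / len(hyp)
    recall = m / len(ref)
    # Harmonic mean that weights recall nine times more than precision.
    f_mean = 10 * precision * recall / (recall + 9 * precision)

    # A chunk is a run of matches that is contiguous and in the same
    # order in both hypothesis and reference.
    chunks = 1
    for (i0, j0), (i1, j1) in zip(alignment, alignment[1:]):
        if not (i1 == i0 + 1 and j1 == j0 + 1):
            chunks += 1

    penalty = 0.5 * (chunks / m) ** 3
    return f_mean * (1 - penalty)


if __name__ == "__main__":
    truth = "sinus rhythm normal ecg"
    print(simple_meteor(truth, "sinus rhythm normal ecg"))
    print(simple_meteor(truth, "sinus rhythm"))
```

Because recall dominates the F-mean, a generated report that omits findings present in the reference is penalized more heavily than one that adds extra words, which is one reason the metric is commonly used for caption-style outputs.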

Paper Structure

This paper contains 30 sections, 4 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Overview of our proposed method for automated ECG textual summary generation. The input consists of multi-lead ECG data, from which our model generates a textual summary. Additionally, the method identifies and highlights supporting diagnostic features in the ECG signal (e.g., "extrasystole," "anterior," and "hemiblock") as part of the generated explanation, providing interpretability to support clinical decision-making. Note that the ground truth shown at the top in this example is not available to the model.
  • Figure 2: Architectures of our (a) Transformer-based and (b) LSTM-based models for ECG signal analysis.
  • Figure 3: METEOR Score Progression for Main Experiments on PTB-XL (Official Splits)