Transfer Learning in ECG Diagnosis: Is It Effective?

Cuong V. Nguyen, Cuong D. Do

TL;DR

This study presents the first extensive empirical evaluation of transfer learning in multi-label ECG classification, comparing fine-tuning performance with that of training from scratch across a variety of ECG datasets and deep neural networks.

Abstract

The adoption of deep learning in ECG diagnosis is often hindered by the scarcity of large, well-labeled datasets in real-world scenarios, which has led to the use of transfer learning to leverage features learned from larger datasets. Yet the prevailing assumption that transfer learning consistently outperforms training from scratch has never been systematically validated. In this study, we conduct the first extensive empirical evaluation of transfer learning in multi-label ECG classification, comparing fine-tuning performance with that of training from scratch across a variety of ECG datasets and deep neural networks. We confirm that fine-tuning is the preferable choice for small downstream datasets; however, when the dataset is sufficiently large, training from scratch can achieve comparable performance, albeit with a longer training time to catch up. Furthermore, of the two most prevalent architectures for time-series ECG applications, we find that convolutional neural networks are more compatible with transfer learning than recurrent neural networks. Our results underscore the importance of transfer learning in ECG diagnosis, yet depending on the amount of available data, researchers may opt not to use it, given the non-negligible cost of pre-training.

Paper Structure (13 sections, 6 figures, 2 tables)


Figures (6)

  • Figure 1: Performance comparison of fine-tuning and training from scratch, with three upstream datasets, six models, and four downstream datasets. In each chart, six symbols depict the average $f_1$-scores of the respective models, and the bar shows the mean of these average scores across the six models.
  • Figure 3: Fine-tuning improvement of the three ResNets with varying downstream dataset size.
  • Figure 4: Performance of ResNet1d18 during fine-tuning and training from scratch. The three rows represent the three upstream datasets: PTB-XL, CPSC2018, and Georgia, respectively.
  • Figure 5: Performance of ResNet1d50 during fine-tuning and training from scratch.
  • Figure 6: Performance of ResNet1d101 during fine-tuning and training from scratch.
  • ...and 1 more figure
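
The Figure 1 caption describes two levels of aggregation: an average $f_1$-score per model, and a bar giving the mean of those scores across models. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. It assumes that "average $f_1$-score" means the unweighted mean of per-label $f_1$ over the labels of a multi-label task (macro averaging), which the caption does not state. The function names and the toy label matrices are hypothetical.

```python
from statistics import mean


def f1_per_label(y_true, y_pred):
    """Return one F1 score per label.

    y_true and y_pred are lists of rows (one row per ECG recording),
    each row a list of 0/1 flags, one flag per diagnostic label.
    """
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Convention chosen for this sketch: a label with no positives
        # and no positive predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def average_f1(y_true, y_pred):
    """Macro average (an assumption): unweighted mean of per-label F1."""
    return mean(f1_per_label(y_true, y_pred))


def bar_height(scores_by_model):
    """Mean of the per-model average F1 scores, i.e. the bar in each chart."""
    return mean(list(scores_by_model.values()))


# Toy data: three recordings, two labels.
toy_true = [[1, 0], [1, 1], [0, 1]]
toy_pred = [[1, 0], [0, 1], [0, 1]]
toy_score = average_f1(toy_true, toy_pred)
```

Under this assumption, each symbol in a chart would correspond to one `average_f1` value and the bar to `bar_height` applied to the six per-model values. A different averaging scheme (for example micro or sample-wise averaging) would change the per-model numbers but not the second aggregation step.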