A Comprehensive Benchmark for Electrocardiogram Time-Series

Zhijiang Tang, Jiaxin Qi, Yuhua Zheng, Jianqiang Huang

TL;DR

This paper presents an in-depth investigation of ECG signals and establishes a comprehensive benchmark that identifies limitations in traditional evaluation metrics for ECG analysis and introduces a novel metric and model architecture.

Abstract

Electrocardiogram (ECG), a key bioelectrical time-series signal, is crucial for assessing cardiac health and diagnosing various diseases. Given its time-series format, ECG data is often incorporated into pre-training datasets for large-scale time-series model training. However, existing studies often overlook its unique characteristics and specialized downstream applications, which differ significantly from those of other time-series data, leading to an incomplete understanding of its properties. In this paper, we present an in-depth investigation of ECG signals and establish a comprehensive benchmark, which includes (1) categorizing its downstream applications into four distinct evaluation tasks; (2) identifying limitations in traditional evaluation metrics for ECG analysis and introducing a novel metric; and (3) benchmarking state-of-the-art time-series models and proposing a new architecture. Extensive experiments demonstrate that our proposed benchmark is comprehensive and robust. The results validate the effectiveness of the proposed metric and model architecture, which establish a solid foundation for advancing research in ECG signal analysis.

Paper Structure

This paper contains 13 sections, 19 equations, 8 figures, 9 tables, 1 algorithm.

Figures (8)

  • Figure 1: Illustrations of our proposed ECG benchmark. (a) Overview of four ECG evaluation tasks, including classification, detection, forecasting, and generation, across various ECG applications. (b) Comparison between the traditional ECG metric MSE and our proposed metric, FFD. The results are computed using the forecasting results of Medformer [wang2024medformer] and Timer$^*$ [liutimer] on the NFE dataset [NFE]; more details are provided in the analysis section. (c) Performance comparison of ECG models, including traditional methods, large time series models, and our proposed Patch Step-by-Step Model (PSSM).
  • Figure 2: Illustrations of the architecture of our Patch Step-by-Step Model (PSSM). (a) Components of the ConvBlock. (b) Patching operation in the encoder, where the tokens are obtained by averaging the corresponding tokens from the previous layer. (c) Unpatching operation in the decoder, where tokens are generated by the weighted splitting of the corresponding tokens from the last layer. (d) Projection heads for the four ECG evaluation tasks.
  • Figure 3: Qualitative results of different models on the ECG generation task. The three rows of samples were selected from the MITDB [MITDB], FEPL [FEPL], and SST [SST] datasets, respectively. Red curves are the ground truth ECG, and blue curves are the generated ECG.
  • Figure 4: Comparison of MSE and FFD under different temporal shifts. The top row displays ECG signals at various temporal shifts, where red curves are the ground truth and blue curves are the time-shifted ECG. The bottom row illustrates how MSE and FFD vary across different temporal shifts. As the temporal shift increases, the semantic meaning of the ECG remains consistent, yet the MSE increases while the FFD remains stable.
  • Figure 5: Illustrations of the attention maps for a pre-trained time-series model with and without additional ECG pre-training. Timer$^*$ [liutimer], which is further pre-trained on ECG datasets, exhibits an attention map with periodic patterns not observed in the raw Timer model. The samples are selected from the PTB dataset [PTB].
  • ...and 3 more figures
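
The Figure 4 caption notes that MSE grows as a signal is shifted in time even though its content is unchanged. The sketch below reproduces only that MSE behaviour on a toy periodic spike train. The signal generator, its parameters, and all function names are assumptions made for illustration; it uses no real ECG data, and it does not implement FFD, whose definition is not given in this excerpt.

```python
import math


def synthetic_beats(n_samples=1000, period=200, width=5.0):
    """Toy periodic spike train standing in for an ECG (hypothetical, not real data)."""
    signal = []
    for t in range(n_samples):
        # Signed distance from the beat centre, which sits mid-way through each period.
        offset = (t % period) - period / 2
        signal.append(math.exp(-0.5 * (offset / width) ** 2))
    return signal


def circular_shift(signal, shift):
    """Delay the signal by `shift` samples, wrapping the tail around to the front."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def mse(a, b):
    """Point-wise mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


reference = synthetic_beats()
shifts = (0, 2, 5, 10, 20)
shift_to_mse = {s: mse(reference, circular_shift(reference, s)) for s in shifts}

for s, err in shift_to_mse.items():
    print(f"shift={s:3d} samples  MSE={err:.5f}")
```

Because the toy signal is exactly periodic, every shifted copy contains the same beats as the reference; only their alignment changes. The point-wise error nevertheless rises with each larger shift in this range, which is the kind of sensitivity the caption attributes to MSE.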
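
The Figure 2 caption describes patching as averaging tokens from the previous layer and unpatching as a weighted split of tokens. A minimal sketch of those two operations follows, assuming a patch size of 2 and fixed scalar split weights. Both choices, along with the function names `patch` and `unpatch`, are hypothetical; the excerpt does not specify how PSSM sets the patch size or obtains its weights.

```python
def patch(tokens, size=2):
    """Merge every `size` consecutive token vectors into one by element-wise averaging."""
    if len(tokens) % size:
        raise ValueError("token count must be divisible by the patch size")
    merged = []
    for start in range(0, len(tokens), size):
        group = tokens[start:start + size]
        # zip(*group) iterates over feature dimensions across the grouped tokens.
        merged.append([sum(dim) / size for dim in zip(*group)])
    return merged


def unpatch(tokens, weights):
    """Split each token into len(weights) tokens, each a copy scaled by one weight."""
    return [[w * x for x in token] for token in tokens for w in weights]


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
coarse = patch(tokens)               # 4 tokens -> 2 tokens
fine = unpatch(coarse, [0.5, 1.5])   # 2 tokens -> 4 tokens

print(coarse)  # [[2.0, 3.0], [6.0, 7.0]]
print(fine)    # [[1.0, 1.5], [3.0, 4.5], [3.0, 3.5], [9.0, 10.5]]
```

Averaging halves the sequence length while keeping the feature dimension, and the weighted split restores the original length. In a trained model the split weights would presumably be learned rather than fixed as they are here.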