An Efficient and Flexible Deep Learning Method for Signal Delineation via Keypoints Estimation

Adrian Atienza, Jakob Bardram, Sadasivan Puthusserypady

TL;DR

This work addresses the gap between DL-based ECG delineation outputs and clinicians’ expectations by introducing KEED, a keypoint-estimation model inspired by human pose estimation. KEED outputs onset, peak, and offset coordinates for P, QRS, and T waves within each R–R interval, removing the need for post-processing and enabling a tunable $\lambda$ threshold to balance sensitivity and specificity. Implemented as a 1D HourGlass/U‑Net architecture with soft-gated skip connections, trained on a small labeled dataset, KEED achieves superior or comparable performance to state-of-the-art wavelet-based methods while delivering substantially faster inference (52x–703x speedups). The approach offers practical clinical benefits through its flexibility and speed, though its evaluation is currently limited by data availability for T-waves and unseen arrhythmias, which are expected to improve with more annotated data.

Abstract

Deep Learning (DL) methods have been used for electrocardiogram (ECG) processing in a wide variety of tasks, demonstrating good performance compared with traditional signal processing algorithms. These methods offer an efficient framework with a limited need for a priori data pre-processing and feature engineering. While several studies use this approach for ECG signal delineation, a significant gap persists between the expected and the actual outcome. Existing methods rely on a sample-to-sample classifier, whereas the clinically expected outcome consists of the onset, offset, and peak of each wave that composes an R-R interval. Aligning the actual output with the expected one requires post-processing algorithms. This counteracts two of the main advantages of DL models, since these algorithms are based on assumptions and slow down the method's performance. In this paper, we present Keypoint Estimation for Electrocardiogram Delineation (KEED), a novel DL model designed for keypoint estimation, which natively offers an output aligned with clinical expectations. By departing from the conventional sample-to-sample classifier, we achieve two benefits: (i) eliminating the need for additional post-processing, and (ii) establishing a flexible framework in which the threshold value can be adjusted to balance the sensitivity-specificity tradeoff according to particular clinical requirements. The proposed method's performance is compared with state-of-the-art (SOTA) signal processing methods. Remarkably, KEED significantly outperforms them despite being optimized with an extremely limited amount of annotated data. In addition, KEED decreases the inference time by a factor ranging from 52x to 703x.

Paper Structure (15 sections, 4 figures, 2 tables)

This paper contains 15 sections, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Analogy between the human pose estimation task and the signal delineation task.
  • Figure 2: Overview of KEED. The ECG recording is processed by the Pan–Tompkins algorithm for R-peak identification. The signal is split into R-R intervals, which are passed through the DL model. After the presence or absence of each keypoint is decided from the computed probability and the $\lambda$ parameter, the locations of the keypoints that are present are translated to match the original input.
  • Figure 3: The U-Net-based DL architecture. It consists of an encoder that synthesizes the information contained in the input into a dense latent space. A decoder uses this space to reconstruct an output of the same length as the input, expanded to K channels, where K is the number of keypoints to be identified. Each output channel represents the probability that the respective keypoint is located at each sample. The encoder and decoder are linked through residual connections.
  • Figure 4: Influence of the $\lambda$ value on the false-positive/false-negative trade-off.
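
The decoding step described in the captions of Figures 2 and 4 can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the model returns a (K, T) probability map for each R-R interval, that a keypoint counts as present when the maximum of its channel reaches $\lambda$, and that each interval was resampled to a fixed length T, so indices are mapped back linearly. The function name `decode_keypoints` and its parameters are hypothetical.

```python
import numpy as np


def decode_keypoints(prob_map, interval_start, interval_len, lam=0.5):
    """Turn a (K, T) probability map into keypoint locations.

    Assumptions (not taken from the paper): presence is decided by the
    channel maximum, and the R-R interval was resampled to T samples.
    Returns one entry per keypoint: a sample index in the original
    recording, or None when the keypoint is judged absent.
    """
    prob_map = np.asarray(prob_map, dtype=float)
    n_keypoints, n_samples = prob_map.shape
    # Factor that maps model-space indices back to the original interval.
    scale = interval_len / n_samples

    locations = []
    for k in range(n_keypoints):
        idx = int(np.argmax(prob_map[k]))
        if prob_map[k, idx] >= lam:
            # Present: translate to the coordinates of the full recording.
            locations.append(interval_start + int(round(idx * scale)))
        else:
            # Absent: the peak probability is below the lambda threshold.
            locations.append(None)
    return locations


# Toy example with K = 2 keypoints and T = 4 samples.
toy_map = [
    [0.1, 0.9, 0.2, 0.0],  # confident keypoint
    [0.1, 0.2, 0.3, 0.1],  # weak keypoint
]
strict = decode_keypoints(toy_map, interval_start=100, interval_len=8, lam=0.5)
lenient = decode_keypoints(toy_map, interval_start=100, interval_len=8, lam=0.25)
```

In this sketch, lowering `lam` admits the weak keypoint as well, which trades fewer false negatives for more false positives. This is the kind of trade-off that Figure 4 reports.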