PTB-XL-Image-17K: A Large-Scale Synthetic ECG Image Dataset with Comprehensive Ground Truth for Deep Learning-Based Digitization

Naqcho Ali Mehdi

TL;DR

This work addresses the lack of large-scale, ground-truth ECG image datasets suitable for end-to-end digitization by introducing PTB-XL-Image-17K, a synthetic dataset of 17,271 high-quality 12-lead ECG images with complete ground truth across images, segmentation masks, time-series signals, YOLO-format bounding boxes, and rich metadata. It provides an open-source generation framework with controllable parameters to simulate diverse recording conditions, including lead regions and lead-name annotations, and validates high-fidelity signal reconstruction and accurate localization. The dataset supports end-to-end digitization tasks and overlapping waveform research, offering robust baselines for lead detection, waveform segmentation, and pixel-to-signal calibration, with strong performance metrics (IoU >0.90, correlation >0.998). By making both data and framework publicly available, it aims to accelerate development of automated ECG digitization pipelines applicable to legacy archives, telemedicine, and multi-modal learning, while outlining future extensions to more layouts and real-scanned validation.

Abstract

Electrocardiogram (ECG) digitization, the conversion of paper-based or scanned ECG images back into time-series signals, is critical for leveraging decades of legacy clinical data in modern deep learning applications. However, progress has been hindered by the lack of large-scale datasets that provide both ECG images and their corresponding ground truth signals with comprehensive annotations. We introduce PTB-XL-Image-17K, a complete synthetic ECG image dataset comprising 17,271 high-quality 12-lead ECG images generated from the PTB-XL signal database. Our dataset uniquely provides five complementary data types per sample: (1) realistic ECG images with authentic grid patterns and annotations (50% with visible grid, 50% without), (2) pixel-level segmentation masks, (3) ground truth time-series signals, (4) bounding box annotations in YOLO format for both lead regions and lead name labels, and (5) comprehensive metadata including visual parameters and patient information. We present an open-source Python framework that enables customizable dataset generation with controllable parameters, including paper speed (25/50 mm/s), voltage scale (5/10 mm/mV), sampling rate (500 Hz), grid appearance (4 colors), and waveform characteristics. Dataset generation achieved a 100% success rate with an average processing time of 1.35 seconds per sample. PTB-XL-Image-17K addresses critical gaps in ECG digitization research by providing the first large-scale resource that supports the complete pipeline (lead detection, waveform segmentation, and signal extraction) with full ground truth for rigorous evaluation. The dataset, generation framework, and documentation are publicly available at https://github.com/naqchoalimehdi/PTB-XL-Image-17K and https://doi.org/10.5281/zenodo.18197519.
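
The paper speed and voltage scale listed in the abstract imply a linear mapping between signal samples and image pixels. The sketch below illustrates that arithmetic only; it is not taken from the released framework. The function names, the default `dpi` value, and the `origin_x_px` / `baseline_y_px` arguments are assumptions introduced for illustration.

```python
MM_PER_INCH = 25.4  # exact conversion between inches and millimetres


def sample_to_pixel(sample_index, amplitude_mv, *, sampling_rate_hz=500.0,
                    paper_speed_mm_s=25.0, voltage_scale_mm_mv=10.0,
                    dpi=200.0, origin_x_px=0.0, baseline_y_px=0.0):
    """Map one signal sample to (x, y) pixel coordinates.

    dpi, origin_x_px and baseline_y_px are hypothetical rendering
    parameters; the source does not state the values it uses.
    """
    px_per_mm = dpi / MM_PER_INCH
    time_s = sample_index / sampling_rate_hz
    # Horizontal axis: time, scaled by the paper speed.
    x_px = origin_x_px + time_s * paper_speed_mm_s * px_per_mm
    # Vertical axis: voltage; image y grows downward, so subtract.
    y_px = baseline_y_px - amplitude_mv * voltage_scale_mm_mv * px_per_mm
    return x_px, y_px


def pixel_to_sample(x_px, y_px, *, paper_speed_mm_s=25.0,
                    voltage_scale_mm_mv=10.0, dpi=200.0,
                    origin_x_px=0.0, baseline_y_px=0.0):
    """Invert the mapping: pixel coordinates back to (seconds, millivolts)."""
    px_per_mm = dpi / MM_PER_INCH
    time_s = (x_px - origin_x_px) / (paper_speed_mm_s * px_per_mm)
    amplitude_mv = (baseline_y_px - y_px) / (voltage_scale_mm_mv * px_per_mm)
    return time_s, amplitude_mv
```

At 25 mm/s and 10 mm/mV, one second of signal spans 25 mm horizontally and 1 mV spans 10 mm vertically, regardless of the rendering resolution; the resolution only sets how many pixels each millimetre occupies.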

Paper Structure (50 sections, 2 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Project folder structure of the PTB-XL-Image-17K generation framework showing the modular organization of source code, configuration files, and output directories.
  • Figure 2: Representative ECG images from PTB-XL-Image-17K showing the two grid visibility conditions. (a) Sample with visible red grid at 0.8 opacity showing 1 mm small boxes and 5 mm bold boxes. (b) Sample without grid on a clean white background. Both samples show the same 12×1 layout with lead names, calibration pulses, and metadata header. The balanced distribution is intended to train models to handle both scenarios, which are commonly encountered in clinical practice.
  • Figure 3: Generated dataset folder structure showing the organization of train, validation, and test splits with five output types per sample: images, masks, signals, metadata, and YOLO labels.
  • Figure 4: Example 12×1 layout showing vertical lead arrangement with calibration pulses, lead labels, and metadata header.
  • Figure 5: Frequency domain analysis showing filter performance. Top: Raw signal spectrum. Bottom: Filtered signal spectrum. Diagnostic frequency band (0.5-40 Hz) is preserved while baseline wander and high-frequency noise are removed.
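
The YOLO labels mentioned in Figure 3 and the abstract follow the standard YOLO text convention: one line per box, `class x_center y_center width height`, with coordinates normalized by image size. The sketch below converts between pixel boxes and such lines. The helper names and the class-id assignment are hypothetical; the source does not state which ids denote lead regions or lead-name labels.

```python
def box_to_yolo(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Format a pixel-space box as one normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def yolo_to_box(line, img_w, img_h):
    """Parse one YOLO label line back into a pixel-space box."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(p) for p in parts[1:5])
    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max


if __name__ == "__main__":
    # Hypothetical example: class 0 is assumed here to mean "lead region".
    label = box_to_yolo(0, 100, 50, 300, 150, img_w=1000, img_h=500)
    print(label)
    print(yolo_to_box(label, img_w=1000, img_h=500))
```

Because the coordinates are normalized, the same label file remains valid if an image is resized without changing its aspect ratio.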