Comparative Analysis of Data Augmentation for Clinical ECG Classification with STAR
Nader Nemati
TL;DR
This work tackles robust multi-label 12-lead ECG classification under cross-site heterogeneity by introducing Sinusoidal Time–Amplitude Resampling (STAR), a beat-aligned augmentation that time-warps and amplitude-scales each R–R segment while preserving P–QRS–T morphology. Implemented within a 1D SE–ResNet-18 framework and evaluated via a source-aware, stratified five-fold protocol on a large multi-institutional corpus, STAR demonstrates superior generalization and gains over baseline and other time-domain transforms, achieving micro-AUROC around 0.95 and macro-AUROC around 0.90 on held-out, cross-source data. The approach is invertible under mild conditions and is complemented by a transparent training workflow and open-source code, enabling reuse in practical clinical ECG pipelines. Overall, STAR provides a simple, controllable augmentation that preserves diagnostic morphology while enhancing cross-source durability and rare-class learning.
Abstract
Clinical 12-lead ECG classification remains difficult because diverse recording conditions, overlapping pathologies, and pronounced label imbalance hinder generalization, while unconstrained augmentations risk distorting diagnostically critical morphology. In this study, Sinusoidal Time–Amplitude Resampling (STAR) is introduced as a beat-wise augmentation that operates strictly between successive R-peaks, applying controlled time warping and amplitude scaling to each R–R segment while preserving the canonical P–QRS–T order and leaving the head and tail of the trace unchanged. STAR is designed for practical pipelines and offers: (i) morphology-faithful variability that broadens training diversity without corrupting peaks or intervals; (ii) source-resilient training, improving stability across devices, sites, and cohorts without dataset-specific tuning; (iii) model-agnostic integration with common 1D ECG encoder backbones such as SE–ResNet; and (iv) better learning on rare classes via beat-level augmentation, reducing overfitting by resampling informative beats instead of duplicating whole records. In contrast to global crops, large shifts, or additive noise, STAR avoids transformations that suppress or misalign clinical landmarks. A complete Python implementation and a transparent training workflow are released, aligned with a source-aware, stratified five-fold protocol over a multi-institutional 12-lead corpus, thereby facilitating inspection and reuse. Taken together, STAR provides a simple and controllable augmentation for clinical ECG classification where trustworthy morphology, operational simplicity, and cross-source durability are essential.
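To make the beat-wise idea concrete, the following is a minimal sketch of a sinusoidal time warp and amplitude envelope applied between successive R-peaks. It is not the released STAR implementation: the abstract does not state the exact warp or scaling functions, so the forms used here, the function name `star_sketch`, and the parameters `warp` and `gain` are illustrative assumptions. R-peak indices are assumed to come from a separate detector.

```python
import numpy as np


def star_sketch(ecg, r_peaks, warp=0.3, gain=0.1):
    """Hypothetical beat-wise sinusoidal time-amplitude resampling.

    ecg     : array of shape (n_samples,) or (n_leads, n_samples)
    r_peaks : increasing sample indices of detected R-peaks
    warp    : time-warp strength (assumed form); |warp| < 1 keeps the
              warp strictly monotonic, so the P-QRS-T order is preserved
    gain    : peak relative amplitude change inside each R-R segment
    """
    if not abs(warp) < 1.0:
        raise ValueError("warp must satisfy |warp| < 1 for a monotonic warp")

    x = np.atleast_2d(np.asarray(ecg, dtype=float))
    out = x.copy()  # samples before the first and after the last R-peak stay untouched
    peaks = np.asarray(r_peaks, dtype=int)

    for start, end in zip(peaks[:-1], peaks[1:]):
        n = end - start
        if n < 2:
            continue  # segment too short to resample

        # Normalised time in [0, 1], including both bounding R-peaks.
        u = np.arange(n + 1) / n
        # Assumed warp: derivative is 1 + warp * cos(2*pi*u) > 0, and the
        # endpoints map to themselves, so R-peak positions do not move.
        warped = u + warp / (2.0 * np.pi) * np.sin(2.0 * np.pi * u)
        # Assumed envelope: equals 1 at both R-peaks, deviates by at most `gain`.
        envelope = 1.0 + gain * np.sin(np.pi * u)

        grid = np.arange(start, end + 1)
        source_positions = start + warped * n
        for lead in range(x.shape[0]):
            segment = x[lead, start:end + 1]
            out[lead, start:end + 1] = envelope * np.interp(source_positions, grid, segment)

    return out[0] if np.ndim(ecg) == 1 else out
```

In this sketch the same warp is applied to every lead of a segment so that inter-lead timing stays aligned, and keeping `|warp| < 1` makes the time map strictly increasing, which is one way to obtain the invertibility mentioned in the TL;DR. In training, `warp` and `gain` would typically be drawn at random per record or per beat; the ranges to use are not given here and would need to be taken from the released code.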
