Comparative Analysis of Data Augmentation for Clinical ECG Classification with STAR

Nader Nemati

TL;DR

This work tackles robust multi-label 12-lead ECG classification under cross-site heterogeneity by introducing Sinusoidal Time–Amplitude Resampling (STAR), a beat-aligned augmentation that time-warps and amplitude-scales each R–R segment while preserving P–QRS–T morphology. Implemented within a 1D SE–ResNet-18 framework and evaluated with a source-aware, stratified five-fold protocol on a large multi-institutional corpus, STAR generalizes better than the baseline and other time-domain transforms, reaching a micro-AUROC of about 0.95 and a macro-AUROC of about 0.90 on held-out, cross-source data. The approach is invertible under mild conditions and is accompanied by a transparent training workflow and open-source code, enabling reuse in practical clinical ECG pipelines. Overall, STAR provides a simple, controllable augmentation that preserves diagnostic morphology while improving cross-source durability and rare-class learning.
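
The two reported metrics weight classes differently, which matters under label imbalance: micro-AUROC pools every (record, label) decision into one ranking problem, while macro-AUROC averages the per-class AUROCs so that rare classes count as much as common ones. The sketch below illustrates the distinction with NumPy only; the function names and the pairwise formulation are assumptions chosen for illustration and are not taken from the paper's released code.

```python
import numpy as np

def pairwise_auroc(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Returns NaN when one of the two classes is absent.
    Quadratic in memory, so intended only for small illustrative inputs.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    if pos.size == 0 or neg.size == 0:
        return float("nan")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def micro_macro_auroc(Y, S):
    """Y, S: arrays of shape (n_records, n_classes) with binary labels and scores."""
    Y = np.asarray(Y)
    S = np.asarray(S, dtype=float)
    # Micro: flatten all record-label decisions into a single binary problem.
    micro = pairwise_auroc(Y.ravel(), S.ravel())
    # Macro: unweighted mean of per-class AUROCs (classes without both
    # positives and negatives are skipped via nanmean).
    per_class = [pairwise_auroc(Y[:, c], S[:, c]) for c in range(Y.shape[1])]
    macro = float(np.nanmean(per_class))
    return micro, macro

if __name__ == "__main__":
    Y = [[1, 0], [0, 1], [1, 1], [0, 0]]
    S = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.1, 0.5]]
    print(micro_macro_auroc(Y, S))
```

A gap between the two numbers, as in the figures quoted above, typically indicates that some less frequent classes are ranked less reliably than the pooled score suggests.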

Abstract

Clinical 12-lead ECG classification remains difficult because diverse recording conditions, overlapping pathologies, and pronounced label imbalance hinder generalization, while unconstrained augmentations risk distorting diagnostically critical morphology. In this study, Sinusoidal Time–Amplitude Resampling (STAR) is introduced as a beat-wise augmentation that operates strictly between successive R-peaks to apply controlled time warping and amplitude scaling to each R–R segment, preserving the canonical P–QRS–T order and leaving the head and tail of the trace unchanged. STAR is designed for practical pipelines and offers: (i) morphology-faithful variability that broadens training diversity without corrupting peaks or intervals; (ii) source-resilient training, improving stability across devices, sites, and cohorts without dataset-specific tuning; (iii) model-agnostic integration with common 1D SE–ResNet-style ECG encoder backbones; and (iv) better learning on rare classes via beat-level augmentation, reducing overfitting by resampling informative beats instead of duplicating whole records. In contrast to global crops, large shifts, or additive noise, STAR avoids transformations that suppress or misalign clinical landmarks. A complete Python implementation and a transparent training workflow are released, aligned with a source-aware, stratified five-fold protocol over a multi-institutional 12-lead corpus, thereby facilitating inspection and reuse. Taken together, STAR provides a simple and controllable augmentation for clinical ECG classification where trustworthy morphology, operational simplicity, and cross-source durability are essential.
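
To make the beat-wise operation concrete, here is a minimal NumPy sketch of an augmentation in the spirit of STAR: each R–R segment is resampled to a new length and rescaled in amplitude, while the samples before the first R-peak and from the last R-peak onward are copied unchanged. The abstract does not give the exact coefficient formula, so the sinusoidal schedule below (`1 + time_amp * sin(...)` for duration, `1 + gain_amp * cos(...)` for gain), the parameter names, and the use of linear interpolation are all assumptions for illustration, not the released implementation. R-peak indices are assumed to come from an external detector.

```python
import numpy as np

def star_like_augment(ecg, r_peaks, time_amp=0.1, gain_amp=0.1, phase=0.0):
    """Beat-wise time/amplitude resampling sketch (hypothetical parameters).

    ecg     : array of shape (n_leads, n_samples)
    r_peaks : sorted, distinct sample indices of detected R-peaks
    Returns an array with the same number of leads; the length may change.
    """
    ecg = np.asarray(ecg, dtype=float)
    r_peaks = np.asarray(r_peaks, dtype=int)
    if r_peaks.size < 2:
        return ecg.copy()  # no complete R-R segment to transform

    n_seg = r_peaks.size - 1
    angle = 2.0 * np.pi * np.arange(n_seg) / n_seg + phase
    time_scale = 1.0 + time_amp * np.sin(angle)  # assumed duration factors
    gain = 1.0 + gain_amp * np.cos(angle)        # assumed amplitude factors

    pieces = [ecg[:, : r_peaks[0]]]              # head, kept as-is
    for i in range(n_seg):
        seg = ecg[:, r_peaks[i] : r_peaks[i + 1]]
        old_len = seg.shape[1]
        new_len = max(2, int(round(old_len * time_scale[i])))
        old_t = np.linspace(0.0, 1.0, old_len, endpoint=False)
        new_t = np.linspace(0.0, 1.0, new_len, endpoint=False)
        # Linear interpolation keeps sample order, so P-QRS-T ordering
        # inside the segment is preserved.
        warped = np.stack([np.interp(new_t, old_t, lead) for lead in seg])
        pieces.append(gain[i] * warped)
    pieces.append(ecg[:, r_peaks[-1] :])         # tail, kept as-is
    return np.concatenate(pieces, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((12, 1000))
    y = star_like_augment(x, [100, 300, 520, 700, 900], time_amp=0.2)
    print(x.shape, y.shape)
```

Because the output length can differ from the input, a training pipeline would typically crop or pad the result to a fixed window afterwards. Note also that a piecewise-constant gain can introduce small steps at segment boundaries; this sketch does not smooth them.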

Paper Structure

This paper contains 27 sections, 2 theorems, 13 equations, 6 figures, 9 tables, 1 algorithm.

Key Result

Proposition 1

Assume the following: Then $\mathcal{T}_{\mathrm{STAR}}$ is invertible on the class of piecewise bandlimited signals segmented by $R$, with an inverse and the reconstruction error is bounded by the interpolation error.

Figures (6)

  • Figure 1: Normalized distribution of label multiplicity per source (percent). Exact SPH percentages are shown; CinC bars can be updated with precise values if desired.
  • Figure 2: SE–ResNet-18 classifier and its SE-augmented residual block shown side by side for reference.
  • Figure 3: STAR schematic. Top: original ECG with detected R-peaks. Middle: partition into consecutive R–R segments. Bottom: per-segment time resampling and amplitude scaling by sinusoidal coefficients, followed by exact head/tail reattachment; P–QRS–T ordering preserved.
  • Figure 4: Training pipeline with proposed augmentation.
  • Figure 5: Source-aware multi-label stratification and 5-fold evaluation: each original dataset is split into five parts; each new fold is filled by a stratified portion from every source; training uses four folds, and the remaining fold is used for evaluation, cycled across all five folds.
  • ...and 1 more figure
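
The fold construction described in the Figure 5 caption can be sketched as follows. This is a simplified stand-in, not the paper's procedure: it groups records by (source, exact label set) and deals each group round-robin across the five folds, so every fold receives a near-equal share of each source and of each label combination. The function name, the exact-label-set grouping, and the round-robin assignment are assumptions; true multi-label stratification usually needs a more careful scheme for rare label combinations.

```python
import random
from collections import defaultdict

def source_aware_folds(sources, labels, n_folds=5, seed=0):
    """Assign each record a fold index in [0, n_folds).

    sources : list of source names, one per record (hypothetical input format)
    labels  : list of iterables of label names, one per record
    """
    rng = random.Random(seed)

    # Group record indices by source and exact label combination.
    groups = defaultdict(list)
    for idx, (src, lab) in enumerate(zip(sources, labels)):
        groups[(src, tuple(sorted(lab)))].append(idx)

    fold_of = [None] * len(sources)
    offset = 0  # carried across groups so small groups do not all hit fold 0
    for key in sorted(groups):
        members = groups[key]
        rng.shuffle(members)
        for j, idx in enumerate(members):
            fold_of[idx] = (offset + j) % n_folds
        offset += len(members)
    return fold_of

if __name__ == "__main__":
    sources = ["SPH"] * 10 + ["CinC"] * 10
    labels = [("AF",) if i % 2 else ("NSR",) for i in range(20)]
    folds = source_aware_folds(sources, labels)
    for k in range(5):
        test_idx = [i for i, f in enumerate(folds) if f == k]
        train_idx = [i for i, f in enumerate(folds) if f != k]
        print(k, len(train_idx), len(test_idx))
```

Each of the five rounds then trains on the four folds whose index differs from `k` and evaluates on fold `k`, as in the caption.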

Theorems & Definitions (2)

  • Proposition 1: STAR is invertible under mild conditions
  • Proposition 2: Frequency-domain effect of Multiply–Triangle