Generating Realistic Multi-Beat ECG Signals

Paul Pöhl, Viktor Schlegel, Hao Li, Anil Bharath

TL;DR

This work tackles generating clinically realistic long-form ECG time series, addressing the limitations of diffusion models that perform well on short beats but falter for extended sequences. It introduces a three-layer pipeline—beat-level diffusion for high-fidelity beats, multivariate feature generation for inter-beat dynamics, and feature-guided assembly to stitch beats into coherent long sequences—preserving inter-beat cues such as $R$-$R$ intervals. Comprehensive evaluation shows maintained beat morphology, faithful inter-beat feature distributions, and improved arrhythmia classification performance when training on synthetic long-form data compared with end-to-end diffusion baselines. This approach enables multi-minute synthetic ECG generation with potential for privacy-preserving data augmentation and enhanced clinical analytics.

Abstract

Generating synthetic ECG data has numerous applications in healthcare, from educational purposes to simulating scenarios and forecasting trends. While recent diffusion models excel at generating short ECG segments, they struggle with the longer sequences needed for many clinical applications. This paper proposes a novel three-layer synthesis framework for generating realistic long-form ECG signals. We first generate high-fidelity single beats using a diffusion model, then synthesize inter-beat features that preserve critical temporal dependencies, and finally assemble the beats into coherent long sequences using feature-guided matching. Our comprehensive evaluation demonstrates that the resulting synthetic ECGs maintain both beat-level morphological fidelity and clinically relevant inter-beat relationships. In arrhythmia classification tasks, training on our long-form synthetic ECGs yields significantly better performance than training on long-form ECGs generated end-to-end by a diffusion model, highlighting their utility for downstream applications. The approach enables generation of unprecedented multi-minute ECG sequences while preserving essential diagnostic characteristics.
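To make the three-layer idea concrete, the following is a minimal, illustrative sketch and not the authors' implementation. Every name and parameter in it is an assumption: a Gaussian bump (`make_beat`) stands in for a diffusion-generated beat, a simple AR(1) process (`sample_rr_intervals`) stands in for the multivariate inter-beat feature generator, and nearest-neighbour matching on a single feature, the R-peak amplitude (`match_beat`), stands in for feature-guided matching. The abstract does not specify any of these choices.

```python
import math
import random

FS = 100  # assumed sampling rate in Hz (illustrative only)


def make_beat(n_samples, r_amplitude):
    """Toy beat: a Gaussian bump centred in the window, standing in for a generated beat."""
    centre = n_samples // 2
    width = n_samples / 20
    return [r_amplitude * math.exp(-0.5 * ((i - centre) / width) ** 2)
            for i in range(n_samples)]


def sample_rr_intervals(n_beats, mean_rr=0.8, phi=0.7, sigma=0.03, seed=0):
    """Toy inter-beat features: AR(1) R-R intervals in seconds (a stand-in generator)."""
    rng = random.Random(seed)
    rr = [mean_rr]
    for _ in range(n_beats - 1):
        rr.append(mean_rr + phi * (rr[-1] - mean_rr) + rng.gauss(0.0, sigma))
    return rr


def match_beat(bank, target_amplitude):
    """Pick the bank entry whose R amplitude is closest to the target feature."""
    return min(bank, key=lambda b: abs(b["r_amplitude"] - target_amplitude))


def assemble(bank, rr_intervals, target_amplitudes, fs=FS):
    """Place matched beats so that consecutive R peaks are rr_intervals[i] seconds apart.

    rr_intervals[i] is the gap between beat i and beat i + 1; the last entry is unused.
    Overlapping windows are summed (overlap-add).
    """
    beat_len = len(bank[0]["samples"])
    half = beat_len // 2  # R peak sits at the window centre
    r_positions, pos = [], half
    for rr in rr_intervals:
        r_positions.append(pos)
        pos += max(1, round(rr * fs))
    signal = [0.0] * (r_positions[-1] + beat_len - half)
    for r_pos, amp in zip(r_positions, target_amplitudes):
        beat = match_beat(bank, amp)["samples"]
        start = r_pos - half
        for i, value in enumerate(beat):
            signal[start + i] += value
    return signal, r_positions


# Layer 1 stand-in: a small bank of beats with known features.
bank = [{"r_amplitude": a, "samples": make_beat(60, a)} for a in (0.8, 1.0, 1.2)]
# Layer 2 stand-in: a sequence of inter-beat features.
rr = sample_rr_intervals(5)
target_amplitudes = [1.0, 0.82, 1.19, 1.0, 0.95]
# Layer 3: feature-guided assembly into one long signal.
signal, r_positions = assemble(bank, rr, target_amplitudes)
```

In this sketch the R-R intervals are honoured by construction, because each beat is positioned at an R-peak index derived from the generated intervals rather than at a fixed stride. A real system would match on several features at once and would need to handle the boundaries between beats more carefully than plain overlap-add.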

Paper Structure

This paper contains 24 sections, 1 equation, 6 figures, 8 tables.

Figures (6)

  • Figure 1: High-level methodology overview: (a) Beat-level ECG generation using a diffusion model; (b) Feature extraction and generation of synthetic features; (c) Long-sequence assembly via feature-beat matching.
  • Figure 2: A comparison of the mean of individual generated beats vs original beats with their standard deviations.
  • Figure 3: Heatmap of the difference in probability density (Generated - Original) at each time step. Red regions indicate higher synthetic density, while blue regions indicate higher original density.
  • Figure 4: Overview of pairwise correlations of features from the original and synthetic ECG data. The synthetic data largely preserves important correlations.
  • Figure 5: Comparison of randomly chosen long-form ECGs. We compare an original signal against a synthetic signal generated with our approach and two signals generated end-to-end using the BRIDGE diffusion model [li2025bridge] and TimeVQVAE [lee2023vectorquantizedtimeseries] (bottom).
  • ...and 1 more figure