Simulator and Experience Enhanced Diffusion Model for Comprehensive ECG Generation
Xiaoda Wang, Kaiqiao Han, Yuhao Xu, Xiao Luo, Yizhou Sun, Wei Wang, Carl Yang
TL;DR
SE-Diff tackles text-to-ECG generation by integrating a lightweight ODE-based ECG simulator into a latent-diffusion framework and by augmenting the conditioning signal with retrieved clinical knowledge. The model operates in a VAE latent space, where a beat decoder and simulator-informed regularizers encourage physiologically plausible waveforms and coherent inter-lead relationships. An LLM-powered retrieval pipeline injects experience-based clinical patterns from EHRs, improving semantic alignment between text prompts and generated ECGs. On real-world data, SE-Diff outperforms baselines in signal fidelity, physiological realism, and diagnostic-text alignment, and it also improves downstream ECG classification when used for data augmentation. The approach offers a principled path toward physiologically grounded, clinically informed generative ECG models with practical utility for data expansion and privacy-preserving sharing.
Abstract
Cardiovascular disease (CVD) is a leading cause of mortality worldwide. Electrocardiograms (ECGs) are the most widely used non-invasive tool for cardiac assessment, yet large, well-annotated ECG corpora are scarce due to cost, privacy, and workflow constraints. Synthetic ECG generation can support a mechanistic understanding of cardiac electrical activity, enable the construction of large, heterogeneous, and unbiased datasets, and facilitate privacy-preserving data sharing. Generating realistic ECG signals from clinical context is therefore important, yet it remains underexplored. Recent work has leveraged diffusion models for text-to-ECG generation, but two challenges remain: (i) existing methods often overlook the mechanistic knowledge of cardiac activity that physiological simulators encode; and (ii) they ignore broader, experience-based clinical knowledge grounded in real-world practice. To address these gaps, we propose SE-Diff, a novel physiological simulator- and experience-enhanced diffusion model for comprehensive ECG generation. SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder and simulator-consistent constraints, injecting mechanistic priors that promote physiologically plausible waveforms. In parallel, we design an LLM-powered, experience-based retrieval-augmented strategy that injects clinical knowledge and provides additional guidance for ECG generation. Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment over baselines. We further show that both the simulator-based and the experience-based knowledge also benefit downstream ECG classification.
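To make the idea of a lightweight ODE-based ECG simulator concrete, the following is a minimal sketch of one widely used family of such simulators: a three-variable dynamical model in which a point circles a unit limit cycle once per heartbeat and Gaussian-shaped events at fixed phases produce the P, Q, R, S, and T waves. It is an assumption that SE-Diff's simulator resembles this; the abstract does not specify its equations. The wave parameters, the function names (`ecg_rhs`, `simulate_ecg`, `count_r_peaks`), and the fixed-step Runge-Kutta integrator are all illustrative choices, not the authors' implementation.

```python
import math

# (phase theta_i, amplitude a_i, width b_i) for the P, Q, R, S, T waves.
# Illustrative values for a generic dynamical ECG model; not taken from SE-Diff.
WAVES = [
    (-math.pi / 3, 1.2, 0.25),    # P wave
    (-math.pi / 12, -5.0, 0.1),   # Q wave
    (0.0, 30.0, 0.1),             # R wave
    (math.pi / 12, -7.5, 0.1),    # S wave
    (math.pi / 2, 0.75, 0.4),     # T wave
]


def ecg_rhs(state, omega):
    """ODE right-hand side: (x, y) circle the unit limit cycle, z is the ECG trace."""
    x, y, z = state
    alpha = 1.0 - math.hypot(x, y)   # pulls (x, y) back onto the unit circle
    theta = math.atan2(y, x)         # current phase within the heartbeat
    dz = -z                          # relaxation toward a zero baseline
    for theta_i, a_i, b_i in WAVES:
        # Phase distance to the wave centre, wrapped into [-pi, pi].
        d = math.remainder(theta - theta_i, 2.0 * math.pi)
        dz -= a_i * d * math.exp(-d * d / (2.0 * b_i * b_i))
    return (alpha * x - omega * y, alpha * y + omega * x, dz)


def _step(state, h, deriv):
    """Return state + h * deriv, component-wise."""
    return tuple(s + h * k for s, k in zip(state, deriv))


def simulate_ecg(heart_rate_bpm=60.0, duration_s=5.0, fs=250):
    """Integrate the model with classical RK4 and return the sampled z trace."""
    omega = 2.0 * math.pi * heart_rate_bpm / 60.0  # angular speed in rad/s
    dt = 1.0 / fs
    state = (-1.0, 0.0, 0.0)  # start half a cycle before the first R peak
    trace = []
    for _ in range(int(duration_s * fs)):
        trace.append(state[2])
        k1 = ecg_rhs(state, omega)
        k2 = ecg_rhs(_step(state, 0.5 * dt, k1), omega)
        k3 = ecg_rhs(_step(state, 0.5 * dt, k2), omega)
        k4 = ecg_rhs(_step(state, dt, k3), omega)
        state = tuple(
            s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return trace


def count_r_peaks(trace, rel_threshold=0.6):
    """Count local maxima that exceed a fraction of the global maximum."""
    threshold = rel_threshold * max(trace)
    return sum(
        1
        for i in range(1, len(trace) - 1)
        if trace[i] > threshold
        and trace[i] > trace[i - 1]
        and trace[i] >= trace[i + 1]
    )


if __name__ == "__main__":
    ecg = simulate_ecg(heart_rate_bpm=60.0, duration_s=5.0, fs=250)
    print(len(ecg), "samples,", count_r_peaks(ecg), "R peaks")
```

With the defaults (60 bpm, 5 s, 250 Hz), `simulate_ecg()` returns 1250 samples containing five R peaks, one per simulated beat. Because each wave is controlled by only three interpretable parameters, a model of this kind can plausibly serve as a cheap mechanistic prior: a generated beat can be compared against the simulator's output, and the mismatch can be penalized as a regularizer.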
