TransConv-DDPM: Enhanced Diffusion Model for Generating Time-Series Data in Healthcare
Md Shahriar Kabir, Sana Alamgeer, Minakshi Debnath, Anne H. H. Ngu
TL;DR
TransConv-DDPM addresses data scarcity in healthcare time-series by enhancing diffusion-based generation with a transformer bottleneck and multi-scale convolutions, enabling robust modeling of local and global temporal dependencies. The method outperforms TimeGAN and Diffusion-TS across stick-balancing, SmartFallMM, and EEG datasets, exhibiting high distribution fidelity and predictive utility. A notable finding is the significant performance boost in fall-detection tasks when synthetic data generated by TransConv-DDPM is combined with real data, underscored by substantial gains in F1-score and accuracy. These results support the practical value of high-fidelity synthetic physiological time-series for training robust clinical AI systems and point to avenues for broader, conditional, and multi-modal data generation in healthcare.
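To illustrate the idea behind multi-scale convolutions, the following minimal sketch applies several 1-D filters of different widths to the same series and stacks the results as feature channels. It is not the authors' module: the kernel sizes (3, 7, 15), the moving-average weights standing in for learned filters, and the names `conv1d_same` and `multi_scale_features` are all assumptions made for illustration.

```python
def conv1d_same(series, kernel):
    """1-D cross-correlation with zero padding; output length equals input length.

    Assumes an odd kernel length so the padding is symmetric.
    """
    half = len(kernel) // 2
    padded = [0.0] * half + list(series) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(series))
    ]


def multi_scale_features(series, kernel_sizes=(3, 7, 15)):
    """Return one feature channel per kernel size.

    Moving-average kernels are placeholders for learned weights (assumption).
    """
    return [conv1d_same(series, [1.0 / k] * k) for k in kernel_sizes]


# A single spike makes the receptive field of each branch visible.
series = [0.0] * 10 + [1.0] + [0.0] * 10
channels = multi_scale_features(series)
```

Each branch spreads the spike over a window as wide as its kernel, so narrow kernels respond to local detail while wide kernels summarize longer context. In a trained model the weights would be learned, and a transformer layer would typically handle dependencies beyond the widest kernel.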
Abstract
The lack of real-world data in clinical fields poses a major obstacle to training effective AI models for diagnostic and preventive tools in medicine. Generative AI has shown promise in increasing data volume and enhancing model training, particularly in computer vision and natural language processing (NLP). However, generating physiological time-series data, a common data type in medical AI applications, presents unique challenges due to its inherent complexity and variability. This paper introduces TransConv-DDPM, an enhanced generative AI method for biomechanical and physiological time-series data generation. The model employs a denoising diffusion probabilistic model (DDPM) with a U-Net, multi-scale convolution modules, and a transformer layer to capture both global and local temporal dependencies. We evaluated TransConv-DDPM on three diverse datasets, generating both long- and short-sequence time-series data. Quantitative comparisons against two state-of-the-art methods, TimeGAN and Diffusion-TS, using four performance metrics showed promising results, particularly on the SmartFallMM and EEG datasets, where TransConv-DDPM effectively captured the more gradual temporal change patterns between data points. Additionally, a utility test on the SmartFallMM dataset revealed that adding synthetic fall data generated by TransConv-DDPM improved predictive model performance, yielding a 13.64% improvement in F1-score and a 14.93% increase in overall accuracy compared to the baseline model trained solely on fall data from the SmartFallMM dataset. These findings highlight the potential of TransConv-DDPM to generate high-quality synthetic data for real-world applications.
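For readers unfamiliar with DDPMs, the forward (noising) process that any such model learns to reverse can be sketched as follows. This is the standard closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, not the TransConv-DDPM denoiser itself. The linear schedule, its endpoints (1e-4 to 0.02), the 1000 steps, and the toy sine signal are assumptions chosen for illustration; the abstract does not state the schedule used.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) for a 1-D series x0 at 0-based step index t."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Toy stand-in for a physiological signal (hypothetical data).
x0 = [math.sin(2 * math.pi * i / 32) for i in range(128)]
x_early, _ = q_sample(x0, 10, alpha_bars, rng)      # mostly signal
x_late, noise_late = q_sample(x0, 999, alpha_bars, rng)  # almost pure noise
```

During training, a denoising network (here, the U-Net with multi-scale convolutions and a transformer layer described above) would typically be fit to predict the returned `noise` from `x_t` and `t`; generation then runs the learned reverse process starting from Gaussian noise.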
