ECG Signal Denoising Using Multi-scale Patch Embedding and Transformers
Ding Zhu, Vishnu Kabir Chhabra, Mohammad Mahdi Khalili
TL;DR
This work tackles ECG denoising under diverse noise conditions by introducing a Transformer-based model that takes multi-scale patch embeddings from a 1D convolutional front-end. The architecture follows a U-Net-like encoder–decoder built from Transformer blocks with patch merging and separating, so it operates on multi-resolution temporal features. On the MIT-BIH dataset, the model achieves higher SNR and lower RMSE than the baselines across noise types, and downstream classification on the denoised signals is more accurate than on the noisy inputs. The results indicate that the model captures noise components at multiple temporal scales, with practical implications for robust wearable ECG monitoring and downstream diagnostics.
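To make the two reported metrics concrete, here is a minimal sketch of how SNR and RMSE are commonly computed for a denoised signal against a clean reference. The function names and the exact definitions (SNR as the power of the clean signal over the power of the residual error, in dB) are assumptions for illustration; this section does not state the formulas the authors use.

```python
import math


def rmse(clean, denoised):
    """Root-mean-square error between a clean reference and a denoised estimate."""
    if len(clean) != len(denoised):
        raise ValueError("signals must have the same length")
    squared_error = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return math.sqrt(squared_error / len(clean))


def snr_db(clean, denoised):
    """SNR in dB: clean-signal power divided by residual-error power (assumed definition)."""
    if len(clean) != len(denoised):
        raise ValueError("signals must have the same length")
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)


# Toy example: a square-like signal with a constant 0.1 residual error.
clean = [1.0, -1.0, 1.0, -1.0]
denoised = [1.1, -0.9, 1.1, -0.9]
print(round(rmse(clean, denoised), 3))    # 0.1
print(round(snr_db(clean, denoised), 1))  # 20.0
```

Under these definitions a better denoiser drives RMSE down and SNR up, which is the direction of improvement the summary reports.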
Abstract
Cardiovascular disease is a major life-threatening condition that is commonly monitored using electrocardiogram (ECG) signals. However, these signals are often contaminated by various types of noise at different intensities, which significantly interferes with downstream tasks. Denoising ECG signals and increasing their signal-to-noise ratio is therefore crucial for cardiovascular monitoring. In this paper, we propose a deep learning method that combines a one-dimensional convolutional layer with a transformer architecture for denoising ECG signals. The convolutional layer processes the ECG signal with multiple kernel (patch) sizes and generates an embedding called the multi-scale patch embedding. This embedding is then used as the input to a transformer network, enhancing the transformer's capability to denoise the ECG signal.
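The idea of a multi-scale patch embedding can be sketched as follows. This is an illustrative NumPy sketch, not the authors' implementation: the kernel sizes, stride, embedding width, zero padding, and the choice to concatenate the per-scale features are all assumptions, and random weights stand in for the learned convolution filters. The function name `multi_scale_patch_embedding` is hypothetical.

```python
import numpy as np


def multi_scale_patch_embedding(signal, kernel_sizes=(3, 7, 15), stride=4, dim=8, seed=0):
    """Embed a 1D signal with strided convolutions of several kernel sizes.

    Returns an array of shape (num_patches, len(kernel_sizes) * dim), one
    token per patch. All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    signal = np.asarray(signal, dtype=float)
    num_patches = len(signal) // stride
    per_scale = []
    for k in kernel_sizes:
        # Random weights stand in for learned filters (assumption).
        weights = rng.standard_normal((k, dim)) / np.sqrt(k)
        # Zero-pad so every scale yields the same number of patches.
        padded = np.pad(signal, k // 2)
        # All length-k windows, then keep every `stride`-th one.
        windows = np.lib.stride_tricks.sliding_window_view(padded, k)
        windows = windows[::stride][:num_patches]
        # A 1D convolution with `dim` output channels is a window-by-filter product.
        per_scale.append(windows @ weights)
    # Concatenate the scales along the feature axis (assumed fusion rule).
    return np.concatenate(per_scale, axis=1)


# Toy input: one period of a sine wave, 64 samples long.
x = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
tokens = multi_scale_patch_embedding(x)
print(tokens.shape)  # (16, 24)
```

Each row of `tokens` describes the same stretch of signal at three receptive-field sizes, which is the kind of sequence a transformer encoder would then take as input.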
