PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation
Yanlong Chen, Mattia Orlandi, Pierangelo Maria Rapa, Simone Benatti, Luca Benini, Yawei Li
TL;DR
PhysioWave introduces a learnable wavelet front-end paired with a Transformer backbone to capture multi-scale time-frequency structure in physiological signals. Its key components are an Adaptive Wavelet Selector, Frequency-guided Masking, and Cross-Scale CAFFN, which together enable robust single- and multi-modal biosignal representations and efficient linear-probing fusion for EEG/EMG/ECG tasks. Large-scale pretraining on ECG and EMG data yields state-of-the-art results across downstream ECG/EMG benchmarks and improves multi-modal emotion and driving-behavior tasks when fused with EEG encoders. The framework generalizes well across modalities (ECG, EMG, EEG, PPG) and hardware scales, which points to practical potential for wearable health monitoring and clinical diagnostics, while Grad-CAM-like analyses of the learned features preserve interpretability.
Abstract
Physiological signals are often corrupted by motion artifacts, baseline drift, and other low-SNR disturbances, which pose significant challenges for analysis. These signals are also strongly non-stationary, with sharp peaks and abrupt changes that evolve continuously, making them difficult to represent using traditional time-domain or filtering methods. To address these issues, a novel wavelet-based approach for physiological signal analysis is presented, aiming to capture multi-scale time-frequency features in various physiological signals. Leveraging this technique, two large-scale pretrained models specific to EMG and ECG are introduced for the first time, achieving superior performance and setting new baselines in downstream tasks. In addition, a unified multi-modal framework is constructed by integrating a pretrained EEG model, in which each modality is processed by its own dedicated branch and the branch outputs are combined via learnable weighted fusion. This design effectively addresses challenges such as low signal-to-noise ratio, high inter-subject variability, and device mismatch, outperforming existing methods on multi-modal tasks. The proposed wavelet-based architecture lays a solid foundation for the analysis of diverse physiological signals, while the multi-modal design points to next-generation physiological signal processing with potential impact on wearable health monitoring, clinical diagnostics, and broader biomedical applications. Code and data are available at: github.com/ForeverBlue816/PhysioWave
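
The learnable weighted fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_fusion`, the softmax normalization of the weights, and the assumption that every branch emits embeddings of the same dimension are all hypothetical choices made for illustration. In training, the per-modality logits would be learned parameters; here they are plain numbers.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def weighted_fusion(embeddings, logits):
    """Fuse per-modality embeddings with softmax-normalized scalar weights.

    embeddings: dict mapping modality name -> array of shape (batch, dim)
    logits:     dict mapping modality name -> float (learnable in practice;
                fixed here, as an assumption for illustration)
    """
    names = sorted(embeddings)
    weights = softmax(np.array([logits[n] for n in names], dtype=float))
    stacked = np.stack([embeddings[n] for n in names], axis=0)  # (M, batch, dim)
    fused = np.tensordot(weights, stacked, axes=1)              # (batch, dim)
    return fused, dict(zip(names, weights))

# Hypothetical branch outputs for a batch of 4 samples with 8-dim embeddings.
rng = np.random.default_rng(0)
branches = {m: rng.normal(size=(4, 8)) for m in ("ecg", "emg", "eeg")}
fused, w = weighted_fusion(branches, {"ecg": 0.0, "emg": 0.0, "eeg": 0.0})
```

With equal logits the weights are uniform and the fused embedding reduces to the mean of the branch outputs; raising one modality's logit shifts the fused representation toward that branch.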
