NMCSE: Noise-Robust Multi-Modal Coupling Signal Estimation Method via Optimal Transport for Cardiovascular Disease Detection
Peihong Zhang, Zhixin Li, Rui Sang, Yuxuan Liu, Yiqiang Cai, Yizhou Tan, Shengchen Li
TL;DR
The paper addresses robust estimation of the coupling signal between the electrocardiogram (ECG) and phonocardiogram (PCG) for cardiovascular disease detection under real-world noise. It introduces Noise-Robust Multi-Modal Coupling Signal Estimation (NMCSE), which formulates coupling estimation as a distribution-matching problem solved with optimal transport, and integrates the estimate into a Temporal-Spatial Feature Extraction (TSFE) network for multi-modal fusion. Experiments on PhysioNet/CinC 2016 and EPHNOGRAM show that NMCSE outperforms deconvolution-based methods in both estimation quality and intra-state stability, with a reported 97.38% accuracy and 0.98 AUC. By explicitly modeling electromechanical coupling, the work offers a practical, noise-robust route to reliable multi-modal cardiac analysis in wearable and ambulatory settings.
Abstract
The coupling signal refers to a latent physiological signal that characterizes the transformation from cardiac electrical excitation, captured by the electrocardiogram (ECG), to mechanical contraction, recorded by the phonocardiogram (PCG). By encoding the temporal and functional interplay between electrophysiological and hemodynamic events, it serves as an intrinsic link between modalities and offers a unified representation of cardiac function, with strong potential to enhance multi-modal cardiovascular disease (CVD) detection. However, existing coupling signal estimation methods remain highly vulnerable to noise, particularly in real-world clinical and physiological settings, which undermines their robustness and limits practical value. In this study, we propose Noise-Robust Multi-Modal Coupling Signal Estimation (NMCSE), which reformulates coupling signal estimation as a distribution matching problem solved via optimal transport. By jointly aligning amplitude and timing, NMCSE avoids noise amplification and enables stable signal estimation. When integrated into a Temporal-Spatial Feature Extraction (TSFE) network, the estimated coupling signal effectively enhances multi-modal fusion for more accurate CVD detection. To evaluate robustness under real-world conditions, we design two complementary experiments targeting distinct sources of noise. The first uses the PhysioNet 2016 dataset with simulated hospital noise to assess the resilience of NMCSE to clinical interference. The second leverages the EPHNOGRAM dataset with motion-induced physiological noise to evaluate intra-state estimation stability across activity levels. Experimental results show that NMCSE consistently outperforms existing methods under both clinical and physiological noise, highlighting it as a noise-robust estimation approach that enables reliable multi-modal cardiac detection in real-world conditions.
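To make the distribution-matching view concrete, the following is a minimal illustrative sketch, not the NMCSE algorithm itself. It assumes that two non-negative signal envelopes can be normalized into probability distributions over time, and it uses the standard one-dimensional result that a monotone coupling is optimal for the cost |i - j|. The synthetic Gaussian envelopes, the 15-sample delay, and all function names (`to_distribution`, `transport_plan_1d`) are hypothetical choices made only for this example.

```python
import numpy as np

def to_distribution(signal):
    """Normalize the magnitude of a signal into a probability mass function."""
    mass = np.abs(np.asarray(signal, dtype=float))
    total = mass.sum()
    if total == 0:
        raise ValueError("signal has no energy")
    return mass / total

def transport_plan_1d(p, q):
    """Monotone coupling of two 1D distributions on the same time grid.

    In one dimension this greedy in-order matching is an optimal
    transport plan for the cost |i - j|.
    """
    p_left, q_left = p.copy(), q.copy()
    plan = np.zeros((len(p), len(q)))
    i = j = 0
    while i < len(p) and j < len(q):
        moved = min(p_left[i], q_left[j])
        plan[i, j] = moved
        p_left[i] -= moved
        q_left[j] -= moved
        # Advance whichever side has been exhausted.
        if p_left[i] <= q_left[j]:
            i += 1
        else:
            j += 1
    return plan

# Hypothetical envelopes: an "electrical" burst and a delayed "mechanical" burst.
t = np.arange(100)
ecg_env = np.exp(-0.5 * ((t - 30) / 3.0) ** 2)
pcg_env = np.exp(-0.5 * ((t - 45) / 3.0) ** 2)

p = to_distribution(ecg_env)
q = to_distribution(pcg_env)
plan = transport_plan_1d(p, q)

# Transport cost from the plan, and the same quantity from the CDF formula.
cost = (plan * np.abs(t[:, None] - t[None, :])).sum()
w1 = np.abs(np.cumsum(p) - np.cumsum(q)).sum()

# Average signed displacement of mass, read here as a timing offset in samples.
mean_delay = (plan * (t[None, :] - t[:, None])).sum()

print(f"plan cost={cost:.4f}, CDF-based W1={w1:.4f}, mean delay={mean_delay:.4f}")
```

The plan describes how amplitude mass at each time in one envelope is moved to times in the other, so amplitude and timing are matched jointly. Unlike a spectral division, no step divides by small, noise-dominated values, which is one intuition for why a transport-based formulation can avoid noise amplification.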
