NMCSE: Noise-Robust Multi-Modal Coupling Signal Estimation Method via Optimal Transport for Cardiovascular Disease Detection

Peihong Zhang, Zhixin Li, Rui Sang, Yuxuan Liu, Yiqiang Cai, Yizhou Tan, Shengchen Li

TL;DR

The paper tackles robust coupling-signal estimation between ECG and PCG for cardiovascular disease detection under real-world noise. It introduces Noise-Robust Multi-Modal Coupling Signal Estimation (NMCSE), formulates coupling estimation as a distribution-matching problem using optimal transport, and integrates it with a Temporal-Spatial Feature Extraction (TSFE) network for effective multi-modal fusion. Empirical results on PhysioNet/CinC 2016 and EPHNOGRAM demonstrate that NMCSE outperforms deconvolution-based methods in both estimation quality and intra-state stability, achieving 97.38% accuracy and 0.98 AUC. This work provides a practical, noise-robust pathway for reliable wearable/ambulatory multi-modal cardiac analysis by explicitly modeling the electromechanical coupling.

Abstract

The coupling signal refers to a latent physiological signal that characterizes the transformation from cardiac electrical excitation, captured by the electrocardiogram (ECG), to mechanical contraction, recorded by the phonocardiogram (PCG). By encoding the temporal and functional interplay between electrophysiological and hemodynamic events, it serves as an intrinsic link between modalities and offers a unified representation of cardiac function, with strong potential to enhance multi-modal cardiovascular disease (CVD) detection. However, existing coupling signal estimation methods remain highly vulnerable to noise, particularly in real-world clinical and physiological settings, which undermines their robustness and limits practical value. In this study, we propose Noise-Robust Multi-Modal Coupling Signal Estimation (NMCSE), which reformulates coupling signal estimation as a distribution matching problem solved via optimal transport. By jointly aligning amplitude and timing, NMCSE avoids noise amplification and enables stable signal estimation. When integrated into a Temporal-Spatial Feature Extraction (TSFE) network, the estimated coupling signal effectively enhances multi-modal fusion for more accurate CVD detection. To evaluate robustness under real-world conditions, we design two complementary experiments targeting distinct sources of noise. The first uses the PhysioNet 2016 dataset with simulated hospital noise to assess the resilience of NMCSE to clinical interference. The second leverages the EPHNOGRAM dataset with motion-induced physiological noise to evaluate intra-state estimation stability across activity levels. Experimental results show that NMCSE consistently outperforms existing methods under both clinical and physiological noise, highlighting it as a noise-robust estimation approach that enables reliable multi-modal cardiac detection in real-world conditions.
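The distribution-matching idea in the abstract can be illustrated with a generic entropy-regularized optimal transport solver. The sketch below is not the authors' NMCSE algorithm: it uses plain Sinkhorn iterations, and the ground cost that mixes a timing gap with an amplitude gap, the weight `lam`, the regularization `reg`, and the toy Gaussian-bump signals are all assumptions chosen only to show what "jointly aligning amplitude and timing" could look like in code.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, reg=0.1, n_iter=2000):
    """Entropy-regularized OT plan between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / reg)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)            # rescale to match column marginals
        u = a / (K @ v)              # rescale to match row marginals
    return u[:, None] * K * v[None, :]

def time_amplitude_cost(x, y, lam=1.0):
    """Hypothetical ground cost: squared timing gap plus lam * squared amplitude gap."""
    tx = np.linspace(0.0, 1.0, len(x))
    ty = np.linspace(0.0, 1.0, len(y))
    return (tx[:, None] - ty[None, :]) ** 2 + lam * (x[:, None] - y[None, :]) ** 2

# Toy signals (assumed, not from the paper): the same bump at two different times.
n = 40
t = np.linspace(0.0, 1.0, n)
x = np.exp(-0.5 * ((t - 0.3) / 0.05) ** 2)
y = np.exp(-0.5 * ((t - 0.5) / 0.05) ** 2)

# Each sample carries equal mass; the cost decides which samples get matched.
a = np.full(n, 1.0 / n)
b = np.full(n, 1.0 / n)
cost = time_amplitude_cost(x, y)
plan = sinkhorn_plan(a, b, cost)
transport_cost = float((plan * cost).sum())
print(f"transport cost between the two toy signals: {transport_cost:.4f}")
```

Because the plan is built from bounded multiplicative rescalings of a fixed kernel rather than from a spectral division, no step divides by a near-zero frequency component, which is the intuition behind preferring transport-based matching to deconvolution when the recordings are noisy.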

Paper Structure

This paper contains 32 sections, 6 theorems, 14 equations, 4 figures, 6 tables, and 1 algorithm.

Key Result

Theorem 1

There exists a linear time-invariant system characterized by impulse response $h(t)$ such that, under ideal conditions, the PCG signal $X_{\mathrm{PCG}}(t)$ can be expressed as the convolution of the ECG signal $X_{\mathrm{ECG}}(t)$ with $h(t)$:

$$X_{\mathrm{PCG}}(t) = X_{\mathrm{ECG}}(t) * h(t)$$

In realistic clinical environments with noise, the observed PCG signal becomes:

$$X_{\mathrm{PCG}}(t) = X_{\mathrm{ECG}}(t) * h(t) + \epsilon(t)$$

where $\epsilon(t)$ represents additive noise from various sources.
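To make the convolution model concrete, here is a minimal NumPy sketch that simulates a PCG-like observation as an ECG-like signal convolved with an impulse response plus additive noise, then tries to recover the impulse response by unregularized spectral division. Everything in it is an assumption for illustration: the Gaussian pulse train standing in for an ECG, the damped sinusoid standing in for $h(t)$, the noise level, and the helper names. It is a generic deconvolution baseline, not the paper's estimator.

```python
import numpy as np

def gaussian_pulse_train(n, centers, width):
    """Toy ECG stand-in (hypothetical): one narrow Gaussian bump per beat."""
    t = np.arange(n)[:, None]
    c = np.asarray(centers)[None, :]
    return np.exp(-0.5 * ((t - c) / width) ** 2).sum(axis=1)

def observe_pcg(ecg, h, noise_std, rng):
    """Observed PCG = ECG convolved with h, plus additive Gaussian noise."""
    clean = np.convolve(ecg, h)  # full linear convolution
    return clean + noise_std * rng.standard_normal(clean.shape)

def naive_deconvolve(pcg, ecg, n_taps):
    """Estimate h by unregularized spectral division PCG(f) / ECG(f)."""
    n_fft = len(pcg)  # len(ecg) + len(h) - 1, so there is no circular wrap-around
    H = np.fft.rfft(pcg, n_fft) / np.fft.rfft(ecg, n_fft)
    return np.fft.irfft(H, n_fft)[:n_taps]

rng = np.random.default_rng(0)
# Slightly irregular beat positions, as an assumed stand-in for heart-rate variability.
ecg = gaussian_pulse_train(256, centers=[20, 71, 119, 172, 221], width=1.5)
k = np.arange(32)
h = np.exp(-k / 6.0) * np.sin(2 * np.pi * k / 8.0)  # assumed impulse response

h_clean = naive_deconvolve(observe_pcg(ecg, h, 0.0, rng), ecg, 32)
h_noisy = naive_deconvolve(observe_pcg(ecg, h, 0.05, rng), ecg, 32)
err_clean = np.linalg.norm(h_clean - h)
err_noisy = np.linalg.norm(h_noisy - h)
print(f"noise-free error: {err_clean:.2e}, noisy error: {err_noisy:.2e}")
```

The division by the ECG spectrum scales the noise by the inverse of that spectrum, so frequencies where the ECG carries little energy dominate the error. This is the kind of sensitivity that motivates replacing direct deconvolution with a more stable estimator.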

Figures (4)

  • Figure 1: Illustration of the relationship between ECG, PCG, and the coupling signal. ECG reflects the heart’s electrical activity, while PCG conveys its mechanical response. The coupling signal captures the dynamic transformation between them.
  • Figure 2: Temporal-Spatial Feature Extraction Block.
  • Figure 3: Multi-Modal Processing Pipeline. Left: Signal spectrogram and time slices as inputs to spatial and temporal branches. Right: ECG, PCG, and coupling signals processed via TSFE, fused, and classified.
  • Figure 4: Spectral coherence analysis of the NMCSE method. (A) Magnitude-squared coherence between reference and estimated signals at 10 dB SNR, showing that NMCSE better preserves coherence in the mid-frequency band (10–100 Hz). (B) Coherence difference map across frequencies and SNRs, highlighting the advantages of NMCSE under noise. (C) Average coherence preservation by frequency band at 10 dB SNR, with NMCSE achieving a 28% gain in the mid-frequency band.

Theorems & Definitions (10)

  • Theorem 1: ECG-PCG Transformation Model
  • Theorem 2: Error Amplification in Deconvolution
  • Proof: Proof Sketch
  • Theorem 3: Sinkhorn Distance Properties
  • Proof: Proof Sketch
  • Theorem 4: Optimal Cost Function
  • Proof: Proof Sketch
  • Theorem 5: NMCSE Convergence
  • Proof: Proof Sketch
  • Theorem 6: Error Advantage