Reliability-Aware Weighted Multi-Scale Spatio-Temporal Maps for Heart Rate Monitoring

Arpan Bairagi, Rakesh Dey, Siladittya Manna, Umapada Pal

Abstract

Remote photoplethysmography (rPPG) enables contactless estimation of physiological signals from facial videos by analyzing subtle skin color changes. However, rPPG signals are highly susceptible to illumination changes, motion, shadows, and specular reflections, which results in low-quality signals in unconstrained environments. To overcome these issues, we present a Reliability-Aware Weighted Multi-Scale Spatio-Temporal (WMST) map that models pixel reliability by suppressing environmental noise. The noise sources are modeled using different weighting strategies so that the map focuses on more physiologically valid regions. Leveraging the WMST map, we develop a self-supervised learning (SSL) contrastive approach based on Swin-Unet, where positive pairs are generated from conventional rPPG signals and temporally expanded WMST maps. Moreover, we introduce a new High-High-High (HHH) wavelet map as a negative example that retains motion and structural details while filtering out physiological information. Our aim is to estimate heart rate (HR), and experiments on public rPPG benchmarks show that our approach improves robustness to motion and illumination, with lower HR estimation error and higher Pearson correlation than existing SSL-based rPPG methods.

Paper Structure

This paper contains 20 sections, 16 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Qualitative comparison between the MST map and the proposed WMST map. Darker regions indicate lower weights, and brighter regions indicate higher weights (best viewed at 300% zoom).
  • Figure 2: Distribution of the SNR of G-channel values across all ROI combinations, i.e., all non-empty subsets of the set of six ROIs.
  • Figure 3: Qualitative comparison of signal quality between the MST map and the proposed WMST map (best viewed at 300% zoom).
  • Figure 4: Qualitative comparison of HR predictions between RS+rPPG and the proposed method on the VIPL-HR dataset. The predictions are split into four segments for ease of visualization (best viewed at 300% zoom).
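
As a small illustration of the ROI combinations mentioned in the Figure 2 caption, the sketch below enumerates every non-empty subset of six ROIs, which gives 2^6 - 1 = 63 combinations. The ROI labels and the helper name `non_empty_subsets` are placeholders chosen for this example; the caption does not name the regions, and this is not the authors' code.

```python
from itertools import combinations

# Hypothetical ROI labels: the caption only states that there are six ROIs.
ROIS = ["roi_1", "roi_2", "roi_3", "roi_4", "roi_5", "roi_6"]


def non_empty_subsets(items):
    """Return every non-empty subset of items, smallest subsets first."""
    return [
        subset
        for size in range(1, len(items) + 1)
        for subset in combinations(items, size)
    ]


roi_combinations = non_empty_subsets(ROIS)
print(len(roi_combinations))  # 2**6 - 1 = 63
```

Each tuple in `roi_combinations` identifies one set of regions whose G-channel signal could then be scored for SNR; the SNR computation itself is not specified in this section and is left out here.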