rPPG-SysDiaGAN: Systolic-Diastolic Feature Localization in rPPG Using Generative Adversarial Network with Multi-Domain Discriminator
Banafsheh Adami, Nima Karimian
TL;DR
This work tackles incomplete waveform reconstruction in rPPG by introducing a GAN-based Swin-AUnet framework with multi-domain discriminators that enforce fidelity in the time domain, the frequency domain, and the second derivative of the signal. Using a Swin Transformer-enhanced U-Net generator and PatchGAN discriminators, the method recovers not only heart rate but also the full PPG morphology, including systolic and diastolic components, with loss terms for sparsity, variance, and differentiable alignment (Soft-DTW). Across five diverse datasets, the approach yields substantial improvements in HR accuracy and waveform similarity (e.g., ρ ≈ 0.915, FD ≈ 0.248), with strong cross-dataset generalization and ablations confirming the contribution of each component. The proposed framework offers a practical, supervised pathway to noninvasive monitoring of cardiovascular signals from video, with enhanced physiological interpretability and potential clinical value.
Abstract
Remote photoplethysmography (rPPG) offers a novel approach to noninvasive monitoring of vital signs, such as respiratory rate, using a camera. Although several supervised and self-supervised methods have been proposed, they often fail to reconstruct the PPG signal accurately, particularly when distinguishing between its systolic and diastolic components. They tend to focus solely on extracting heart rate, which may not accurately represent the complete PPG signal. To address this limitation, this paper proposes a novel deep learning architecture based on Generative Adversarial Networks that introduces multiple discriminators to extract rPPG signals from facial videos. These discriminators operate on the time domain, the frequency domain, and the second derivative of the original time-domain signal. The discriminator integrates four loss functions: variance loss to mitigate local minima caused by noise; dynamic time warping loss to address local minima induced by misalignment and variable-length sequences; sparsity loss for heart rate adjustment; and variance loss to ensure a uniform distribution across the desired frequency domain and the time interval between the systolic and diastolic phases of the PPG signal.
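To make the three discriminator domains concrete, the following minimal NumPy sketch derives a time-domain, a frequency-domain, and a second-derivative view from a single 1-D signal. It is an illustration under stated assumptions, not the authors' implementation: the function name `discriminator_views`, the 30 Hz sampling rate, the z-score normalization, and the use of a magnitude spectrum and finite differences are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def discriminator_views(signal, fs=30.0):
    """Return time, frequency, and second-derivative views of a 1-D signal.

    Hypothetical helper: fs (frames per second) and the z-score
    normalization are assumptions, not details taken from the paper.
    """
    x = np.asarray(signal, dtype=float)
    # Time-domain view: zero-mean, unit-variance copy of the waveform.
    x = (x - x.mean()) / (x.std() + 1e-8)

    # Frequency-domain view: magnitude of the one-sided FFT.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)

    # Second-derivative view: finite differences applied twice.
    dt = 1.0 / fs
    d2 = np.gradient(np.gradient(x, dt), dt)
    return x, freqs, spectrum, d2


# Synthetic stand-in for a PPG trace: 10 s of a 1.2 Hz sinusoid at 30 fps.
t = np.arange(300) / 30.0
ppg = np.sin(2 * np.pi * 1.2 * t)
x, freqs, spectrum, d2 = discriminator_views(ppg, fs=30.0)

# Dominant non-DC frequency, converted to beats per minute.
peak_hz = freqs[np.argmax(spectrum[1:]) + 1]
heart_rate_bpm = 60.0 * peak_hz
```

In a setup like the one the abstract describes, each view would be passed to its own discriminator. The second derivative emphasizes points of rapid curvature change, which is one plausible reason it helps localize systolic and diastolic features that a heart-rate-only objective ignores.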
