PhysFlow: Skin tone transfer for remote heart rate estimation through conditional normalizing flows
Joaquim Comas, Antonia Alomar, Adria Ruiz, Federico Sukno
TL;DR
Camera-based remote heart rate estimation suffers from skin-tone bias because darker skin tones are underrepresented in public datasets. The paper introduces PhysFlow, which uses conditional normalizing flows to disentangle and transfer skin tone in facial videos. The flow is conditioned on a two-dimensional CIELAB skin-tone representation, and the whole system is trained end-to-end on both original and augmented data. The method combines a 3D-CNN auto-encoder, a c-CNF module, and an rPPG estimator, preserving the pulsatile signal while allowing skin-tone control without external labels. It optimizes a joint objective that includes the CNF likelihood together with perceptual, color, temporal, and physiological losses. On the UCLA-rPPG and MMPD datasets, PhysFlow improves heart-rate estimation for darker skin tones and is compatible with multiple rPPG models, contributing to more equitable performance in remote photoplethysmography.
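As a rough illustration of the likelihood term mentioned above, the standard change-of-variables identity for a conditional flow can be written as follows. The symbols here are assumptions introduced for illustration and are not taken from the paper: $z$ is a latent code from the auto-encoder, $c$ is the skin-tone condition, $f_\theta$ is the invertible flow, $p_0$ is the base density, and the $\lambda$ weights are hypothetical. The weighted-sum form of the joint objective is likewise an assumption; the summary only states which terms are included.

```latex
% Conditional flow log-likelihood (discrete change of variables).
% Continuous-time flows replace the log-determinant with an integral
% of the trace of the Jacobian of the dynamics.
\log p_\theta(z \mid c)
  = \log p_0\bigl(f_\theta(z; c)\bigr)
  + \log \left| \det \frac{\partial f_\theta(z; c)}{\partial z} \right|

% One plausible (assumed) form of the joint objective: a weighted sum
% of the negative log-likelihood and the auxiliary losses.
\mathcal{L}
  = -\lambda_{\mathrm{nll}} \log p_\theta(z \mid c)
  + \lambda_{\mathrm{perc}} \mathcal{L}_{\mathrm{perc}}
  + \lambda_{\mathrm{col}} \mathcal{L}_{\mathrm{col}}
  + \lambda_{\mathrm{temp}} \mathcal{L}_{\mathrm{temp}}
  + \lambda_{\mathrm{phys}} \mathcal{L}_{\mathrm{phys}}
```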
Abstract
In recent years, deep learning methods have shown impressive results for camera-based remote physiological signal estimation, clearly surpassing traditional methods. However, the performance and generalization ability of deep neural networks depend heavily on rich training data that represent the factors of variation encountered in real applications. Unfortunately, many current remote photoplethysmography (rPPG) datasets lack diversity, particularly in darker skin tones, which leads to biased performance in existing rPPG approaches. To mitigate this bias, we introduce PhysFlow, a novel method for augmenting skin-tone diversity in remote heart rate estimation using conditional normalizing flows. PhysFlow is trained end-to-end, so that supervised rPPG approaches can be trained simultaneously on both original and generated data. Additionally, we condition our model on CIELAB color space skin features extracted directly from the facial videos, without the need for skin-tone labels. We validate PhysFlow on two publicly available datasets, UCLA-rPPG and MMPD, demonstrating reduced heart rate error, particularly for dark skin tones. Furthermore, we demonstrate its versatility and adaptability across different data-driven rPPG methods.
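To make the label-free conditioning more concrete, the following is a minimal sketch of how a two-dimensional CIELAB skin feature could be computed from a video clip. It is an illustration under stated assumptions, not the authors' implementation: the function names `srgb_to_lab` and `skin_tone_feature` are hypothetical, the skin mask is assumed to be given, and the choice of the mean L* and b* channels as the two dimensions is an assumption, since the abstract does not say which two CIELAB components are used.

```python
import numpy as np


def srgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] with shape (..., 3) to CIELAB (D65)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB transfer curve to obtain linear RGB.
    linear = np.where(rgb <= 0.04045,
                      rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # Linear RGB -> CIE XYZ using the sRGB primaries and D65 white point.
    rgb_to_xyz = np.array([[0.4124564, 0.3575761, 0.1804375],
                           [0.2126729, 0.7151522, 0.0721750],
                           [0.0193339, 0.1191920, 0.9503041]])
    xyz = linear @ rgb_to_xyz.T
    # Normalize by the D65 reference white.
    t = xyz / np.array([0.95047, 1.0, 1.08883])
    # Piecewise CIELAB nonlinearity.
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)
    lightness = 116.0 * f[..., 1] - 16.0
    a = 500.0 * (f[..., 0] - f[..., 1])
    b = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([lightness, a, b], axis=-1)


def skin_tone_feature(frames, skin_mask):
    """Return an assumed 2-D skin feature (mean L*, mean b*) for a clip.

    frames:    uint8 array of shape (T, H, W, 3), sRGB video frames.
    skin_mask: boolean array of shape (H, W) marking skin pixels
               (assumed to come from an external face/skin detector).
    """
    lab = srgb_to_lab(np.asarray(frames).astype(np.float64) / 255.0)
    skin_pixels = lab[:, skin_mask]                 # shape (T, N, 3)
    mean_lab = skin_pixels.reshape(-1, 3).mean(axis=0)
    # Keep L* and b* only; this channel choice is an assumption.
    return mean_lab[[0, 2]]
```

A feature of this kind could serve as the conditioning vector for the flow: because it is computed from the pixels themselves, no manual skin-tone annotation is required, and a different target value can be supplied at generation time to request a different skin tone.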
