Camera-Based Remote Physiology Sensing for Hundreds of Subjects Across Skin Tones
Jiankai Tang, Xinyi Li, Jiacheng Liu, Xiyuxing Zhang, Zeyu Wang, Yuntao Wang
TL;DR
This work addresses the lack of large-scale, diverse datasets for camera-based remote photoplethysmography (rPPG) by analyzing the VitalVideo collection, the largest real-world rPPG dataset to date, with 893 subjects across six Fitzpatrick skin tones. It evaluates six unsupervised and three supervised methods in cross-dataset experiments, showing that effective training can be achieved with a few hundred subjects and that skin-tone consistency critically affects performance, especially for darker tones. The study provides concrete benchmarks, open-source code, and guidance on data volume and diversity requirements, and it finds that diverse yet balanced data improves robustness and supports fair assessment across datasets. These findings inform dataset design, evaluation practices, and model development for equitable remote physiology sensing in real-world settings.
Abstract
Remote photoplethysmography (rPPG) has emerged as a promising method for non-invasive, convenient measurement of vital signs that takes advantage of the widespread availability of cameras. Despite these advances, existing datasets fall short in size and diversity, which limits comprehensive evaluation under varied conditions. This paper presents an in-depth analysis of the VitalVideo dataset, the largest real-world rPPG dataset to date, encompassing 893 subjects and six Fitzpatrick skin tones. Our experiments with six unsupervised methods and three supervised models demonstrate that datasets comprising a few hundred subjects (i.e., 300 for UBFC-rPPG, 500 for PURE, and 700 for MMPD-Simple) are sufficient for effective rPPG model training. Our findings also highlight the importance of diversity and consistency in skin tones for accurate performance evaluation across datasets.
