Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation
Konstantin Egorov, Stepan Botman, Pavel Blinov, Galina Zubkova, Anton Ivaschenko, Alexander Kolsanov, Andrey Savchenko
TL;DR
This work addresses three limitations of existing rPPG datasets: small size, privacy concerns, and a lack of diversity in recording conditions. It introduces MCD-rPPG, a large-scale multi-view dataset of 3600 synchronized video recordings from 600 subjects in resting and post-exercise states, each aligned with a 100 Hz PPG signal and 13 health biomarkers. It also presents a fast multitask baseline model that combines domain-specific ROI preprocessing with a lightweight 1D feature pyramid network to estimate PPG and multiple biomarkers in real time on CPU. Cross-dataset benchmarking and multi-view analysis show competitive accuracy and faster inference (up to 13% CPU speedup) compared with state-of-the-art methods. The dataset (HuggingFace) and the code (GitHub) are publicly released to accelerate the development of AI-driven medical assistants and telehealth applications that rely on video-based biomarker estimation.
Abstract
Progress in remote PhotoPlethysmoGraphy (rPPG) is limited by critical issues with existing publicly available datasets: small size, privacy concerns with facial videos, and a lack of diversity in conditions. We introduce a novel, comprehensive, large-scale multi-view video dataset for rPPG and health biomarkers estimation. Our dataset comprises 3600 synchronized video recordings from 600 subjects, captured under varied conditions (resting and post-exercise) using multiple consumer-grade cameras at different angles. To enable multimodal analysis of physiological states, each recording is paired with a 100 Hz PPG signal and extended health metrics, such as electrocardiogram, arterial blood pressure, biomarkers, temperature, oxygen saturation, respiratory rate, and stress level. Using this data, we train an efficient rPPG model and compare its quality with existing approaches in cross-dataset scenarios. The public release of our dataset and model should significantly accelerate the development of AI medical assistants.
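As an illustration of how a 100 Hz reference PPG signal like the one described above is commonly turned into a heart-rate label for evaluating rPPG models, the sketch below takes the dominant spectral peak of a PPG segment. It is a generic frequency-domain estimate, not the authors' pipeline: the function name `estimate_heart_rate_bpm`, the 0.7 to 3.0 Hz search band, and the synthetic test signal are all assumptions made for this example; only the 100 Hz sampling rate comes from the abstract.

```python
import numpy as np

PPG_SAMPLE_RATE_HZ = 100.0  # sampling rate stated in the abstract


def estimate_heart_rate_bpm(ppg, fs=PPG_SAMPLE_RATE_HZ, low_hz=0.7, high_hz=3.0):
    """Estimate heart rate (beats per minute) from a PPG segment.

    Hypothetical helper: picks the strongest spectral peak inside an
    assumed heart-rate band of 0.7-3.0 Hz (42-180 bpm). The band limits
    are illustrative and do not come from the paper.
    """
    ppg = np.asarray(ppg, dtype=float)
    ppg = ppg - ppg.mean()                      # remove the DC offset
    windowed = ppg * np.hanning(ppg.size)       # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2  # one-sided power spectrum
    freqs = np.fft.rfftfreq(ppg.size, d=1.0 / fs)

    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("segment too short to resolve the heart-rate band")

    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz


if __name__ == "__main__":
    # Synthetic 30-second "PPG": a 1.2 Hz sinusoid sampled at 100 Hz.
    t = np.arange(0, 30, 1.0 / PPG_SAMPLE_RATE_HZ)
    synthetic_ppg = np.sin(2 * np.pi * 1.2 * t)
    print(f"{estimate_heart_rate_bpm(synthetic_ppg):.1f} bpm")
```

The frequency resolution of this estimate is the inverse of the segment duration, so a 30-second window resolves heart rate in steps of 2 bpm; real recordings would typically also need band-pass filtering and motion-artifact handling, which are omitted here.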
