Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation

Konstantin Egorov, Stepan Botman, Pavel Blinov, Galina Zubkova, Anton Ivaschenko, Alexander Kolsanov, Andrey Savchenko

TL;DR

This work addresses three limitations of existing rPPG datasets: small size, privacy concerns, and limited diversity of recording conditions. It introduces MCD-rPPG, a large-scale multi-view dataset of 3600 synchronized video recordings from 600 subjects in resting and post-exercise states, each aligned with a 100 Hz PPG signal and 13 health biomarkers. It also presents a fast multitask baseline model that combines domain-specific ROI preprocessing with a lightweight 1D feature pyramid network to estimate PPG and multiple biomarkers in real time on CPU. The authors report cross-dataset benchmarks and a multi-view analysis, showing competitive accuracy and faster inference (up to 13% CPU speedup) than state-of-the-art methods. The dataset (HuggingFace) and the code (GitHub) are publicly released to accelerate development of AI-driven medical assistants and telehealth applications that rely on video-based biomarker estimation.

Abstract

Progress in remote photoplethysmography (rPPG) is limited by critical issues in existing publicly available datasets: small size, privacy concerns with facial videos, and lack of diversity in conditions. This paper introduces a comprehensive, large-scale, multi-view video dataset for rPPG and health biomarker estimation. Our dataset comprises 3600 synchronized video recordings from 600 subjects, captured under varied conditions (resting and post-exercise) using multiple consumer-grade cameras at different angles. To enable multimodal analysis of physiological states, each recording is paired with a 100 Hz PPG signal and extended health metrics, such as electrocardiogram, arterial blood pressure, biomarkers, temperature, oxygen saturation, respiratory rate, and stress level. Using this data, we train an efficient rPPG model and compare its quality with existing approaches in cross-dataset scenarios. The public release of our dataset and model should significantly speed up the development of AI medical assistants.

Paper Structure

This paper contains 9 sections, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Value distribution of diastolic blood pressure (a), systolic blood pressure (b), pulse (c) and respiratory rate (d), before and after physical activity.
  • Figure 2: Distribution of record time shifts (between frame timestamps and physical clock) estimated using KDE ($\textrm{cam}_1$ is IriunWebcam, $\textrm{cam}_2$ is FullHDwebcam and $\textrm{cam}_3$ is USBVideo).
  • Figure 3: Distribution of time shift between different video sources ($\textrm{cam}_1$ is IriunWebcam, $\textrm{cam}_2$ is FullHDwebcam and $\textrm{cam}_3$ is USBVideo).
  • Figure 4: Distribution of time shift between ground truth PPG and reconstructed PPG ($\textrm{cam}_1$ is IriunWebcam, $\textrm{cam}_2$ is FullHDwebcam and $\textrm{cam}_3$ is USBVideo).
  • Figure 5: Overview of our baseline model.
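
Figures 3 and 4 report time shifts between signals. As a rough illustration of how such a shift can be measured, the sketch below estimates the lag between a reference PPG and a reconstructed one by locating the peak of their cross-correlation. This is a generic technique and not necessarily the procedure used by the authors; the function name `estimate_shift_seconds` and its parameters are hypothetical, and the 100 Hz default is taken from the PPG sampling rate stated in the abstract.

```python
import numpy as np


def estimate_shift_seconds(reference, estimate, fs=100.0):
    """Estimate how many seconds `estimate` trails `reference`.

    Hypothetical helper for illustration only. Both inputs are 1D
    signals sampled at `fs` Hz. A positive result means `estimate`
    is delayed relative to `reference`.
    """
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)

    # Remove the DC offset so the correlation peak reflects waveform shape.
    ref = ref - ref.mean()
    est = est - est.mean()

    # Full cross-correlation; entry k corresponds to sum_n est[n + k] * ref[n].
    xcorr = np.correlate(est, ref, mode="full")

    # Lag axis matching the "full" output: -(len(ref) - 1) .. len(est) - 1.
    lags = np.arange(-(len(ref) - 1), len(est))

    # The lag with the highest correlation, converted from samples to seconds.
    return lags[np.argmax(xcorr)] / fs
```

For strongly periodic signals such as PPG, the correlation has secondary peaks at multiples of the pulse period, so in practice the search is usually restricted to a plausible lag window; that refinement is omitted here for brevity.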