R2I-rPPG: A Robust Region of Interest Selection Method for Remote Photoplethysmography to Extract Heart Rate

Sandeep Nagar, Mark Hasegawa-Johnson, David G. Beiser, Narendra Ahuja

TL;DR

This paper demonstrates that the proposed ROI selection method, coupled with the Plane-Orthogonal-to-Skin (POS) rPPG method, improves the accuracy and robustness of rPPG in a challenging clinical environment: videos of patients presenting to an Emergency Department with respiratory complaints.

Abstract

The COVID-19 pandemic has underscored the need for low-cost, scalable approaches to contactless measurement of vital signs, either during initial triage at a healthcare facility or during virtual telemedicine visits. Remote photoplethysmography (rPPG) can accurately estimate heart rate (HR) when applied to close-up videos of healthy volunteers in well-lit laboratory settings. However, results from such highly optimized laboratory studies may not translate readily to healthcare settings. One significant barrier to the practical application of rPPG in health care is the accurate localization of the region of interest (ROI). Clinical or telemedicine visits may involve sub-optimal lighting, movement artifacts, and variable camera angle and subject distance. This paper presents an rPPG ROI selection method based on 3D facial landmarks and patient head yaw angle. We then demonstrate the robustness of this ROI selection method when it is coupled with the Plane-Orthogonal-to-Skin (POS) rPPG method and applied to videos of patients presenting to an Emergency Department with respiratory complaints. Our results demonstrate the effectiveness of the proposed approach in improving the accuracy and robustness of rPPG in a challenging clinical environment.
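
As a rough illustration of ROI selection from landmarks and yaw, the sketch below picks one of three candidate regions (forehead, left cheek, right cheek, as named in the figure captions) and averages the colour inside a square patch. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 15-degree threshold, the yaw sign convention, and the `landmarks_px` dictionary of pixel coordinates are all hypothetical. Only the 40x40 patch size comes from the text.

```python
import numpy as np

ROI_SIZE = 40             # pixels; given in the figure captions as a tunable hyper-parameter
YAW_THRESHOLD_DEG = 15.0  # assumption: illustrative cut-off, not taken from the paper


def select_roi_name(yaw_deg, threshold=YAW_THRESHOLD_DEG):
    """Choose a region from the head yaw angle in degrees.

    Assumed sign convention: positive yaw turns the left cheek toward the camera.
    """
    if yaw_deg > threshold:
        return "left_cheek"
    if yaw_deg < -threshold:
        return "right_cheek"
    return "forehead"


def crop_roi(frame, center_xy, size=ROI_SIZE):
    """Return a square patch centred on a landmark, clipped to the frame borders.

    Assumes the landmark itself lies inside the frame.
    """
    h, w = frame.shape[:2]
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))
    half = size // 2
    x0, x1 = max(cx - half, 0), min(cx + half, w)
    y0, y1 = max(cy - half, 0), min(cy + half, h)
    return frame[y0:y1, x0:x1]


def mean_rgb(frame, landmarks_px, yaw_deg):
    """Average the colour channels over the ROI selected for this frame.

    landmarks_px is a hypothetical dict mapping region names to (x, y) pixels.
    """
    name = select_roi_name(yaw_deg)
    patch = crop_roi(frame, landmarks_px[name])
    return name, patch.reshape(-1, patch.shape[-1]).mean(axis=0)
```

Calling `mean_rgb` once per frame would yield the per-frame colour trace that a method such as POS takes as input. A hard threshold is the simplest possible rule; the paper's actual selection logic may differ.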

Paper Structure

This paper contains 20 sections, 3 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of R2I-rPPG for real-time heart rate extraction: (a) input video, (b) face detection with 3D landmark localization, (c) ROI definition using landmarks, (d) temporal color averaging over ROIs, (e) POS algorithm application for raw HR signal extraction, and heart rate calculation via interbeat analysis. (fps = frames per second, $ext_{HR}$ = extracted HR, $GT_{HR}$ = ground truth HR)
  • Figure 2: (a) 3D face mesh: 468 3D landmarks (using MediaPipe). (b) Three ROIs: 1. forehead center, 2. left cheek, and 3. right cheek (each $40\times40$ pixels, centered on its respective 3D landmark). The ROI size, $40\times40$ pixels, is a hyper-parameter and can be set manually based on the video's frame size.
  • Figure 3: Of the three identifiable ROIs (forehead, right cheek, and left cheek), the most appropriate and visible one for raw HR signal extraction is selected based on the yaw angle.
  • Figure 4: Signal filtering and spectrum analysis. Left: Sequential filtering stages showing (a) raw HR signal, (b) ASF filtered signal, (c) CDF filtered signal, and (d) moving average filtered signal demonstrating noise reduction. Right: Power spectrum estimation using (e) Welch's method, (f) CSD, and (g) interbeat analysis for frequency analysis. PSD: power spectral density.
  • Figure 5: Recording setup comparison: (left) conventional public datasets with controlled settings vs. (right) our unrestricted emergency ward setup allowing natural patient movement and variable camera positions.
  • ...and 2 more figures
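
The signal-extraction stages named in Figures 1 and 4 can be sketched as follows: a per-frame mean-RGB trace is projected with the POS method, and a heart rate is read off the strongest spectral peak. This is a hedged illustration only. The POS step follows the commonly published formulation (temporal normalisation, a fixed 2x3 projection, and overlap-adding of windowed outputs), while the 1.6 s window, the 0.7 to 4.0 Hz search band, and all function names are assumptions. A single Hann-windowed periodogram stands in for the Welch, CSD, and interbeat analyses listed in Figure 4, and the filtering stages are omitted.

```python
import numpy as np


def pos_pulse(rgb, fps, window_sec=1.6):
    """Plane-Orthogonal-to-Skin projection of an (N, 3) mean-RGB trace.

    Returns an (N,) pulse signal built by overlap-adding windowed outputs.
    """
    rgb = np.asarray(rgb, dtype=float)
    n = rgb.shape[0]
    win = int(round(window_sec * fps))
    proj = np.array([[0.0, 1.0, -1.0],
                     [-2.0, 1.0, 1.0]])
    pulse = np.zeros(n)
    for start in range(0, n - win + 1):
        block = rgb[start:start + win]
        normed = block / block.mean(axis=0)    # temporal normalisation per channel
        s = normed @ proj.T                    # two projected signals, shape (win, 2)
        s1, s2 = s[:, 0], s[:, 1]
        alpha = s1.std() / (s2.std() + 1e-12)  # balance the two projections
        h = s1 + alpha * s2
        pulse[start:start + win] += h - h.mean()
    return pulse


def dominant_hr_bpm(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Heart rate in beats per minute from the largest in-band spectral peak."""
    sig = np.asarray(signal, dtype=float)
    sig = sig - sig.mean()
    spectrum = np.abs(np.fft.rfft(sig * np.hanning(len(sig)))) ** 2
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]
```

For a trace `rgb` sampled at `fps` frames per second, `dominant_hr_bpm(pos_pulse(rgb, fps), fps)` gives one estimate for the whole clip. A real-time system would more likely apply this over a sliding window of recent frames.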