Camera Measurement of Blood Oxygen Saturation
Jiankai Tang, Xin Liu, Daniel McDuff, Zhang Jiang, Hongming Hu, Luxi Zhou, Nodoka Nagao, Haruta Suzuki, Yuki Nagahama, Wei Li, Linhong Ji, Yuanchun Shi, Izumi Nishidate, Yuntao Wang
TL;DR
This work tackles non-invasive SpO2 estimation from video by introducing a deep learning framework that fuses facial video frames with color-calibration cues. The VC2S model, combined with calibration strategies and ROI-based preprocessing, achieves robust intra- and inter-dataset performance across diverse skin tones, ages, and environments, outperforming traditional signal-processing baselines. Key contributions include comprehensive ablation analyses, a demonstration that calibration (including a color checker) matters for generalization, and a detailed methodology for dataset collection and processing. The approach holds promise for scalable remote health monitoring, though calibration requirements remain an open challenge for real-world deployment across populations and devices.
Abstract
Blood oxygen saturation (SpO2) is a crucial vital sign routinely monitored in medical settings. Traditional methods require dedicated contact sensors, limiting accessibility and comfort. This study presents a deep learning framework for contactless SpO2 measurement using an off-the-shelf camera, addressing challenges related to lighting variations and skin tone diversity. We conducted two large-scale studies with diverse participants and evaluated our method against traditional signal processing approaches in intra- and inter-dataset scenarios. Our approach demonstrated consistent accuracy across demographic groups, highlighting the feasibility of camera-based SpO2 monitoring as a scalable and non-invasive tool for remote health assessment.
