AnyPPG: An ECG-Guided PPG Foundation Model Trained on Over 100,000 Hours of Recordings for Holistic Health Profiling

Guangkun Nie, Xiaocheng Fang, Gongzheng Tang, Yujie Xiao, Jun Li, Bo Liu, Hongyan Li, Shenda Hong

Abstract

Photoplethysmography (PPG) is widely used as a non-invasive and accessible modality for continuous health monitoring. However, despite being a peripheral hemodynamic signal intrinsically coupled with systemic circulation, existing research has largely confined its scope to a narrow range of cardiovascular tasks, leaving a fundamental question underexplored: to what extent can PPG support holistic health profiling beyond traditional cardiovascular applications? To answer this question, we present AnyPPG, a foundation model-based framework designed to reveal the broader health-profiling potential of PPG. To ensure reliable performance for this investigation, AnyPPG is pretrained with ECG guidance on the most diverse PPG corpus with synchronized ECG to date, comprising over 100,000 hours of recordings from six large-scale data sources. This pretraining yields robust and physiologically grounded PPG representations that provide a reliable basis for subsequent analysis. Building upon this pretrained model, we conduct a systematic investigation into the association between PPG and holistic health through, to our knowledge, the first PPG-based phenome-wide disease detection study, spanning 1,468 disease phenotypes in more than 15,000 subjects. Our evaluation demonstrates the effectiveness of AnyPPG: across eight clinical and wearable datasets covering 15 downstream tasks, it achieves the best performance in 13 tasks. More importantly, in the phenome-wide analysis, AnyPPG exhibits meaningful discriminative capability (AUC $\ge$ 0.70) for 307 phenotypes across 16 distinct phecode chapters, including 230 non-circulatory conditions such as dementia and chronic kidney disease, many of which have rarely been explored using PPG. Collectively, these findings indicate that easily acquired PPG signals encode rich health-related information extending well beyond conventional cardiovascular assessment.

Paper Structure

This paper contains 46 sections, 5 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Research question and study overview of AnyPPG. (a) While PPG has been extensively studied for cardiovascular applications, its potential to support holistic health profiling across multi-organ conditions remains unclear. (b) To answer this question, we present AnyPPG, a foundation model-based framework designed to investigate the holistic health-profiling potential of PPG. AnyPPG leverages ECG-guided pretraining on the most diverse PPG corpus with synchronized ECG recordings to date, enabling the learning of physiologically grounded PPG representations and achieving SOTA performance on conventional PPG analysis tasks. Building upon this foundation, it further reveals broad associations between PPG and holistic health through a phenome-wide disease detection study spanning 1,468 phenotypes.
  • Figure 2: ECG-guided cross-modal pretraining of AnyPPG. AnyPPG learns robust and physiologically grounded PPG representations via contrastive alignment in a shared latent space, guided by ECGFounder [li2025electrocardiogram], an ECG foundation model. Synchronized PPG-ECG pairs are aligned while mismatched ones are separated.
  • Figure 3: Phenome-wide disease detection performance of AnyPPG. Left: Distribution of AUC scores for 1,468 disease phenotypes. Middle: Chapter-level AUC performance distribution. Right: Count of high-performing phenotypes (AUC $\ge$ 0.70) across different phecode chapters.
  • Figure A1: The top 60 disease phenotypes with the highest discriminative performance, ranked by AUC.
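
The contrastive alignment described in the Figure 2 caption can be illustrated with a minimal sketch. The exact loss used by AnyPPG is not stated in this section, so the code below assumes a symmetric InfoNCE (CLIP-style) objective over a batch of embeddings. The function name `symmetric_info_nce`, the `temperature` value, and the embedding shapes are hypothetical choices made for illustration, not details from the paper.

```python
import numpy as np


def symmetric_info_nce(ppg_emb, ecg_emb, temperature=0.07):
    """Illustrative symmetric InfoNCE loss (an assumption, not the paper's stated loss).

    ppg_emb, ecg_emb: arrays of shape (batch, dim). Row i of each array is
    assumed to come from the same synchronized PPG-ECG segment.
    """
    # L2-normalize so that dot products are cosine similarities.
    ppg = ppg_emb / np.linalg.norm(ppg_emb, axis=1, keepdims=True)
    ecg = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix: entry (i, j) compares PPG i with ECG j.
    logits = ppg @ ecg.T / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # The matched (synchronized) pair sits on the diagonal.
        return -np.mean(np.diag(log_probs))

    # Average the PPG-to-ECG and ECG-to-PPG directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))
```

Under this assumed objective, the loss is low when each PPG embedding is most similar to its own synchronized ECG embedding and high when the pairing is shuffled, which matches the "align matched, separate mismatched" behavior the caption describes. In a setup guided by a pretrained ECG model, the ECG embeddings would typically come from that model while the PPG encoder is trained; whether the ECG encoder is frozen is not specified here.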