Benchmarking Open-Source PPG Foundation Models for Biological Age Prediction

N. Brag

Abstract

A task-specific model trained on 212,231 UK Biobank subjects to predict vascular age from PPG (AI-PPG Age) fails on a different clinical population: its predictions collapse to a narrow 38-67 year range regardless of true age. Meanwhile, a general-purpose foundation model with no age-related training objective achieves lower error on the same data. We investigate why this happens and what it means for PPG-based biological age prediction. We evaluate three open-source PPG models (Pulse-PPG, PaPaGei-S, AI-PPG Age) on 906 surgical patients from PulseDB, using frozen embeddings with Ridge regression and 5-fold cross-validation. Pulse-PPG reaches MAE = 9.28 years, outperforming both AI-PPG Age in linear-probe mode (MAE = 9.72) and HR/HRV features combined with demographics (MAE = 9.59). Adding demographic features to Pulse-PPG lowers the best MAE to 8.22 years ($R^2$ = 0.517, $r$ = 0.725). The predicted age gap correlates with diastolic blood pressure after adjusting for chronological age ($r$ = -0.188, $p$ = 1.2e-8), consistent with what Apple reported for their proprietary PpgAge model. The remaining gap to Apple's reported result (MAE = 2.43 years) appears to be driven by dataset size (906 vs. 213,593 subjects) and population differences rather than model architecture, as our learning curve shows no plateau. Code is publicly available.
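The evaluation protocol named in the abstract (frozen embeddings, Ridge regression, 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' code. It assumes one embedding vector per subject, uses synthetic random embeddings in place of real model outputs, and picks the regularization strength `alpha`, the embedding dimension, and the helper names `ridge_fit` and `cross_validated_mae` purely for illustration.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def cross_validated_mae(X, y, n_folds=5, alpha=1.0, seed=0):
    """Out-of-fold MAE; each row is one subject, so folds never share subjects."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        w, b = ridge_fit(X[train_mask], y[train_mask], alpha)
        preds[test_idx] = X[test_idx] @ w + b
    return float(np.mean(np.abs(preds - y))), preds

# Synthetic stand-in for frozen embeddings (NOT data from the paper).
rng = np.random.default_rng(42)
n_subjects, dim = 906, 32            # dim is an arbitrary illustrative choice
ages = rng.uniform(20, 90, size=n_subjects)
embeddings = rng.normal(size=(n_subjects, dim))
embeddings[:, 0] += 0.1 * (ages - ages.mean())  # inject an age signal

mae, preds = cross_validated_mae(embeddings, ages)
baseline_mae = float(np.mean(np.abs(ages - ages.mean())))  # predict-the-mean
print(f"probe MAE = {mae:.2f}, mean-predictor MAE = {baseline_mae:.2f}")
```

Because the encoder stays frozen, only the linear head is fitted in each fold. Differences in MAE between models therefore reflect how linearly accessible age information is in each embedding space, not how well each encoder could be fine-tuned.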

Paper Structure (22 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Predicted vs. chronological age for the two contrasting models ($n$ = 906 subjects). Left: AI-PPG Age (zero-shot), a task-specific model trained on 212,231 UK Biobank subjects, produces predictions compressed to a narrow range (38-67 years) regardless of true age, a population-shift failure. Right: Pulse-PPG + Demographics, a general-purpose foundation model with no age-related training objective, achieves MAE = 8.22 years with predictions well distributed along the diagonal.
  • Figure 2: Learning curve: subject-level MAE vs. training set size for Pulse-PPG and PaPaGei-S (5-fold CV, averaged across folds).
  • Figure 3: Correlation of PPG age gap (Pulse-PPG + Demographics) with cardiovascular markers ($n$ = 906). Each panel shows the scatter plot with both raw and age-adjusted (partial) correlations. Only the DBP partial correlation survives Bonferroni correction.
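
The age-adjusted (partial) correlation and Bonferroni check mentioned for Figure 3 can be sketched as follows. This is an illustrative reconstruction, not the authors' analysis code. It uses the standard first-order partial correlation formula on toy values, and the number of tests passed to `bonferroni_significant` is a hypothetical placeholder because the number of markers tested is not stated here.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def bonferroni_significant(p_value, n_tests, alpha=0.05):
    """True if p_value clears the Bonferroni-corrected threshold alpha / n_tests."""
    return p_value < alpha / n_tests

# Toy values (NOT data from the paper): x and y both depend strongly on z.
z = [1, 2, 3, 4, 5, 6, 7, 8]
noise_x = [1, -1, 1, -1, 1, -1, 1, -1]
noise_y = [1, 1, -1, -1, 1, 1, -1, -1]
x = [2 * zi + e for zi, e in zip(z, noise_x)]
y = [-3 * zi + e for zi, e in zip(z, noise_y)]

raw_r = pearson(x, y)              # strongly negative, driven by the shared z
partial_r = partial_corr(x, y, z)  # much weaker once z is adjusted for
print(f"raw r = {raw_r:.3f}, partial r = {partial_r:.3f}")

# p = 1.2e-8 is the DBP value from the abstract; n_tests = 6 is hypothetical.
print(bonferroni_significant(1.2e-8, n_tests=6))
```

In the toy data the raw correlation is almost entirely explained by the shared dependence on `z`. Age plays the same role for the age gap and a cardiovascular marker, which is why the partial correlation is the relevant quantity.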