Aortic Valve Disease Detection from PPG via Physiology-Informed Self-Supervised Learning

Jiaze Wang, Qinghao Zhao, Zizheng Chen, Zhejun Sun, Deyun Zhang, Yuxi Zhou, Shenda Hong

TL;DR

The paper addresses large-scale screening for aortic valve disease (AVD) when labels are scarce by introducing Physiology-Guided Self-Supervised Learning (PG-SSL), which uses 170k unlabeled PPG samples to learn physiology-informed waveform representations. It formalizes four computable PPG pulse patterns (Anacrotic, Pulsus Tardus, Water-Hammer Pulse, Normal) and pre-trains a Multi-Stream ResNet on a Pulse Pattern Recognition proxy task, followed by asymmetric, gated-fusion fine-tuning on a small labeled dataset for AS and AR screening. The approach yields AUCs of 0.765 (AS) and 0.776 (AR), outperforms supervised baselines, and retains independent prognostic value after adjusting for standard risk factors, suggesting PPG as a viable, low-cost tool for early, large-scale AVD screening. The work highlights the importance of domain-informed priors in medical AI under label scarcity and shows potential for wearable-based, opportunistic screening of valvular disease.

Abstract

Traditional diagnosis of aortic valve disease relies on echocardiography, but its cost and required expertise limit its use in large-scale early screening. Photoplethysmography (PPG) has emerged as a promising screening modality due to its widespread availability in wearable devices and its ability to reflect underlying hemodynamics. However, the extreme scarcity of gold-standard labeled PPG data severely constrains the effectiveness of data-driven approaches. To address this challenge, we propose and validate a new paradigm, Physiology-Guided Self-Supervised Learning (PG-SSL), aimed at unlocking the value of large-scale unlabeled PPG data for efficient screening of Aortic Stenosis (AS) and Aortic Regurgitation (AR). Using over 170,000 unlabeled PPG samples from the UK Biobank, we formalize clinical knowledge into a set of PPG morphological phenotypes and construct a pulse pattern recognition proxy task for self-supervised pre-training. A dual-branch, gated-fusion architecture is then employed for efficient fine-tuning on a small labeled subset. The proposed PG-SSL framework achieves AUCs of 0.765 and 0.776 for AS and AR screening, respectively, significantly outperforming supervised baselines trained on limited labeled data. Multivariable analysis further validates the model output as an independent digital biomarker with sustained prognostic value after adjustment for standard clinical risk factors. This study demonstrates that PG-SSL provides an effective, domain knowledge-driven solution to label scarcity in medical artificial intelligence and shows strong potential for enabling low-cost, large-scale early screening of aortic valve disease.
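
The abstract mentions a dual-branch, gated-fusion architecture but this summary does not give its equations. As a rough illustration only, the sketch below shows one common form of gated fusion, in which a sigmoid gate computed from both branch embeddings mixes them element-wise. The function name `gated_fusion`, the gate parameterization, and the toy dimensions are assumptions made for this example, not the authors' implementation, and the "asymmetric" aspect mentioned in the TL;DR is not modeled.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(h_a, h_b, gate_weights, gate_bias):
    """Mix two equal-length branch embeddings with a learned gate.

    Hypothetical formulation (not taken from the paper):
        g     = sigmoid(W @ concat(h_a, h_b) + b)
        fused = g * h_a + (1 - g) * h_b      (element-wise)

    gate_weights has one row per output dimension, each row of
    length len(h_a) + len(h_b); gate_bias has one entry per dimension.
    """
    if len(h_a) != len(h_b):
        raise ValueError("branch embeddings must have the same length")
    concat = list(h_a) + list(h_b)
    fused = []
    for i, (a, b) in enumerate(zip(h_a, h_b)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(z)
        fused.append(g * a + (1.0 - g) * b)
    return fused


# Toy usage: a zero gate gives g = 0.5, i.e. a plain average of the branches.
h_pretrained = [1.0, 2.0]   # e.g. embedding from the pre-trained branch (assumed)
h_finetuned = [3.0, 4.0]    # e.g. embedding from the task-specific branch (assumed)
zero_w = [[0.0] * 4 for _ in range(2)]
averaged = gated_fusion(h_pretrained, h_finetuned, zero_w, [0.0, 0.0])
```

With a strongly positive gate bias the output approaches the first branch, and with a strongly negative bias it approaches the second, which is how such a gate lets the model weight one branch over the other per feature.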

Paper Structure (16 sections, 7 figures, 6 tables)

This paper contains 16 sections, 7 figures, 6 tables.

Figures (7)

  • Figure 1: Performance evaluation of the model on the independent test set. (A, B) ROC curves for AS and AR detection. The model achieves an AUC of 0.765 for AS and 0.776 for AR. (C) Calibration curves. The axes focus on the [0, 0.45] interval, showing high consistency between predicted probabilities and observed prevalence. (D) Screening enrichment factor (Lift curve). The curves, smoothed using a Savitzky-Golay filter, demonstrate significant disease prevalence enrichment in the top 5%-10% high-risk population (peaking at 4.68x for AS), verifying the utility of the model for optimizing medical resource allocation.
  • Figure 2: Grad-CAM visualization of model attention across different groups. (Left) Healthy controls show diffuse attention, verifying holistic waveform integrity. (Center) AS patients show focused attention on the delayed systolic upstroke and dicrotic notch. (Right) AR patients exhibit broad coverage over the high-amplitude systolic phase and early diastolic collapse, reflecting volume overload.
  • Figure 3: Temporal sensitivity analysis of the model. (A, B) ROC curves stratified by time-to-diagnosis intervals for AS and AR. (C, D) Trends of AUROC scores over time. The shaded areas represent 95% confidence intervals. AS shows a linear progression trend, while AR exhibits a plateau phase corresponding to physiological compensation.
  • Figure 4: Kaplan-Meier survival analysis in the PSM-matched cohort. (A, B) Fine-grained risk stratification by quartiles shows a clear dose-response relationship for both AS and AR. (C, D) Evaluation of a binary clinical screening strategy (Top 25% vs. Others) demonstrates significant separation in survival curves, validating the model's utility for early identification of high-risk populations.
  • Figure 5: Forest plots of subgroup analysis for (a) Aortic Stenosis and (b) Aortic Regurgitation. The model shows consistent performance trends across major physiological dimensions.
  • ...and 2 more figures
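
The screening enrichment factor (lift) reported in Figure 1(D) can be read as the disease prevalence among the highest-scoring fraction of the population divided by the overall prevalence. The sketch below computes that ratio for a single cutoff under this standard definition. The function name `lift_at_top_fraction` and the toy data are illustrative assumptions, and the Savitzky-Golay smoothing mentioned in the caption is not reproduced here.

```python
import math


def lift_at_top_fraction(scores, labels, fraction):
    """Prevalence in the top `fraction` of scores divided by overall prevalence.

    scores: model risk scores (higher means higher predicted risk)
    labels: 1 for disease, 0 for no disease, aligned with scores
    fraction: share of the population flagged as high risk, in (0, 1]
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and aligned")
    n = len(scores)
    overall_prevalence = sum(labels) / n
    if overall_prevalence == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    # Number of people in the high-risk group; the small epsilon guards
    # against floating-point overshoot before rounding up.
    k = max(1, math.ceil(fraction * n - 1e-9))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_prevalence = sum(label for _, label in ranked[:k]) / k
    return top_prevalence / overall_prevalence


# Toy data (made up for illustration): 10 people, 2 cases, both ranked at the top.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_lift = lift_at_top_fraction(toy_scores, toy_labels, 0.2)
```

A lift of 1 means the flagged group is no richer in cases than the population as a whole, so the reported peak of 4.68x for AS corresponds to a high-risk group with roughly 4.68 times the baseline prevalence.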