Lifestyle-Informed Personalized Blood Biomarker Prediction via Novel Representation Learning
A. Ali Heydari, Naghmeh Rezaei, Javier L. Prieto, Shwetak N. Patel, Ahmed A. Metwally
TL;DR
This paper tackles the challenge of personalizing blood biomarker references by incorporating lifestyle factors (physical activity and sleep) into a representation-learning framework. It introduces a novel deep metric learning approach with a regularized triplet loss to produce compact, clinically meaningful embeddings, which are then combined with current biomarker values to predict future biomarker levels from a single visit. Using UK Biobank data, the authors show that lifestyle differences meaningfully affect biomarker distributions and that their embeddings outperform traditional representations in downstream tasks, improving future-value prediction accuracy, especially for metabolic biomarkers. The work points toward practical clinical benefits in early disease detection and tailored preventive care, while acknowledging limitations in dataset diversity and follow-up density and outlining plans to validate the approach in more diverse populations.
Abstract
Blood biomarkers are an essential tool for healthcare providers to diagnose, monitor, and treat a wide range of medical conditions. Current reference values and recommended ranges often rely on population-level statistics, which may not adequately account for inter-individual variability driven by factors such as lifestyle and genetics. In this work, we introduce a novel framework for predicting future blood biomarker values and define personalized references through learned representations from lifestyle data (physical activity and sleep) and blood biomarkers. Our proposed method learns a similarity-based embedding space that captures the complex relationship between biomarkers and lifestyle factors. Using the UK Biobank (257K participants), our results show that our deep-learned embeddings outperform traditional and current state-of-the-art representation learning techniques in predicting clinical diagnoses. Using a subset of 6,440 UK Biobank participants with follow-up visits, we validate that including these embeddings and lifestyle factors directly in blood biomarker models improves the prediction of future lab values from a single lab visit. This personalized modeling approach provides a foundation for developing more accurate risk stratification tools and tailoring preventative care strategies. In clinical settings, this translates to the potential for earlier disease detection, more timely interventions, and ultimately, a shift towards personalized healthcare.
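To make the idea of a similarity-based embedding space concrete, the following is a minimal sketch of a triplet loss with a regularization term. It is an illustration under stated assumptions, not the authors' implementation: the L2 penalty on embedding norms is an assumed stand-in for the regularizer (its exact form is not given here), and the function names, `margin`, `reg_weight`, and the toy 2-D embeddings are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def regularized_triplet_loss(anchor, positive, negative,
                             margin=1.0, reg_weight=0.01):
    """Triplet hinge loss plus an L2 penalty on the embedding norms.

    Assumption: the L2 penalty is only a placeholder for the paper's
    regularizer, which this summary does not specify.
    """
    # Hinge term: the anchor should be closer to the positive than to
    # the negative by at least `margin`.
    hinge = max(0.0, euclidean(anchor, positive)
                - euclidean(anchor, negative) + margin)
    # Regularizer: sum of squared norms, which discourages embeddings
    # from drifting far from the origin (keeps the space compact).
    penalty = sum(sum(x * x for x in v)
                  for v in (anchor, positive, negative))
    return hinge + reg_weight * penalty


# Hypothetical 2-D embeddings: anchor and positive stand for two
# participants with similar profiles, negative for a dissimilar one.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negative = [3.0, 4.0]

# Well-separated triplet: the hinge term is zero, only the penalty remains.
loss_separated = regularized_triplet_loss(anchor, positive, negative)

# Swapping positive and negative violates the margin, so the hinge is active.
loss_violated = regularized_triplet_loss(anchor, negative, positive)
```

In this toy case the hinge term vanishes once the negative is farther from the anchor than the positive by more than the margin, so the remaining loss comes only from the assumed norm penalty. In practice the embeddings would come from a trained encoder over lifestyle and biomarker features rather than being fixed by hand.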
