A Multi-target Bayesian Transformer Framework for Predicting Cardiovascular Disease Biomarkers during Pandemics
Trusting Inekwe, Winnie Mkandawire, Emmanuel Agu, Andres Colubri
TL;DR
This work introduces MBT-CB, a Multi-target Bayesian Transformer for predicting four CVD biomarkers (LDL-C, HbA1c, BMI, SysBP) from longitudinal EHR data during pandemics. It fuses ClinicalBERT-based per-visit representations with a variational self-attention mechanism and a DeepMTR head to capture temporal dynamics, inter-biomarker dependencies, and predictive uncertainty. On 3,390 records from 304 patients in Central Massachusetts, MBT-CB achieves superior accuracy (MAE 0.00887, RMSE 0.0135, MSE 0.00027) compared with baselines, while providing uncertainty quantification and interpretable attention insights. The model's CPU-friendly inference and uncertainty-aware outputs support robust clinical decision-making in crisis settings, with future work aimed at broader generalization and explainability enhancements.
Abstract
The COVID-19 pandemic disrupted healthcare systems worldwide, disproportionately impacting individuals with chronic conditions such as cardiovascular disease (CVD). These disruptions, through delayed care and behavioral changes, affected key CVD biomarkers, including LDL cholesterol (LDL-C), HbA1c, BMI, and systolic blood pressure (SysBP). Accurate modeling of these changes is crucial for predicting disease progression and guiding preventive care. However, prior work has not addressed multi-target prediction of CVD biomarkers from Electronic Health Records (EHRs) using machine learning (ML) while jointly capturing biomarker interdependencies, temporal patterns, and predictive uncertainty. In this paper, we propose MBT-CB, a Multi-target Bayesian Transformer (MBT) framework built on a pre-trained BERT-based transformer, to jointly predict the CVD biomarkers LDL-C, HbA1c, BMI, and SysBP from EHR data. The model leverages Bayesian variational inference to estimate uncertainties, embeddings to capture temporal relationships, and a DeepMTR model to capture biomarker inter-relationships. We evaluate MBT-CB on retrospective EHR data from 3,390 CVD patient records (304 unique patients) in Central Massachusetts during the COVID-19 pandemic. MBT-CB outperformed a comprehensive set of baselines, including other BERT-based ML models, achieving an MAE of 0.00887, an RMSE of 0.0135, and an MSE of 0.00027, while effectively capturing data and model uncertainty, patient biomarker inter-relationships, and temporal dynamics via its attention and embedding mechanisms. MBT-CB's superior performance highlights its potential to improve CVD biomarker prediction and support clinical decision-making during pandemics.
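For readers unfamiliar with the reported error metrics, the following is a minimal sketch of how MAE, MSE, and RMSE might be computed for a four-target regression. It is not the authors' evaluation code. The function name `multi_target_errors`, the example values, and the choice to pool errors over all samples and biomarkers are assumptions; the text above does not state whether the metrics are pooled or averaged per biomarker, nor on what scale the targets are expressed.

```python
import numpy as np


def multi_target_errors(y_true, y_pred):
    """Return (MAE, MSE, RMSE) pooled over all samples and all targets.

    Pooling is an assumption made for this sketch; averaging per-target
    metrics would generally give a different RMSE.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))   # mean absolute error
    mse = float(np.mean(err ** 2))      # mean squared error
    rmse = float(np.sqrt(mse))          # root of the pooled MSE
    return mae, mse, rmse


# Hypothetical scaled values for two visits (not data from the study).
# Columns: LDL-C, HbA1c, BMI, SysBP.
y_true = np.array([[0.50, 0.40, 0.30, 0.60],
                   [0.55, 0.42, 0.31, 0.58]])
y_pred = np.array([[0.51, 0.39, 0.30, 0.62],
                   [0.54, 0.42, 0.33, 0.58]])

mae, mse, rmse = multi_target_errors(y_true, y_pred)
print(f"MAE={mae:.5f}  MSE={mse:.7f}  RMSE={rmse:.5f}")
```

With pooled errors, RMSE is by construction the square root of MSE; metrics that are instead averaged across biomarkers or folds need not satisfy that identity.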
