A Multi-target Bayesian Transformer Framework for Predicting Cardiovascular Disease Biomarkers during Pandemics

Trusting Inekwe, Winnie Mkandawire, Emmanuel Agu, Andres Colubri

TL;DR

This work introduces MBT-CB, a Multi-target Bayesian Transformer for predicting four CVD biomarkers (LDL-C, HbA1c, BMI, SysBP) from longitudinal EHR data during pandemics. It fuses ClinicalBERT-based per-visit representations with a variational self-attention mechanism and a DeepMTR head to capture temporal dynamics, inter-biomarker dependencies, and predictive uncertainty. On 3,390 records from 304 patients in Central Massachusetts, MBT-CB achieves superior accuracy (MAE 0.00887, RMSE 0.0135, MSE 0.00027) compared with baselines, while providing uncertainty quantification and interpretable attention insights. The model's CPU-friendly inference and uncertainty-aware outputs support robust clinical decision-making in crisis settings, with future work aimed at broader generalization and explainability enhancements.

Abstract

The COVID-19 pandemic disrupted healthcare systems worldwide, disproportionately impacting individuals with chronic conditions such as cardiovascular disease (CVD). These disruptions, through delayed care and behavioral changes, affected key CVD biomarkers, including LDL cholesterol (LDL-C), HbA1c, BMI, and systolic blood pressure (SysBP). Accurate modeling of these changes is crucial for predicting disease progression and guiding preventive care. However, prior work has not addressed multi-target prediction of CVD biomarkers from Electronic Health Records (EHRs) using machine learning (ML) while jointly capturing biomarker interdependencies, temporal patterns, and predictive uncertainty. In this paper, we propose MBT-CB, a Multi-target Bayesian Transformer (MBT) framework built on a pre-trained BERT-based transformer, to jointly predict the LDL-C, HbA1c, BMI, and SysBP biomarkers from EHR data. The model leverages Bayesian Variational Inference to estimate uncertainties, embeddings to capture temporal relationships, and a DeepMTR model to capture biomarker inter-relationships. We evaluate MBT-CB on retrospective EHR data from 3,390 CVD patient records (304 unique patients) in Central Massachusetts during the COVID-19 pandemic. MBT-CB outperformed a comprehensive set of baselines, including other BERT-based ML models, achieving an MAE of 0.00887, an RMSE of 0.0135, and an MSE of 0.00027, while effectively capturing data and model uncertainty, patient biomarker inter-relationships, and temporal dynamics via its attention and embedding mechanisms. MBT-CB's superior performance highlights its potential to improve CVD biomarker prediction and support clinical decision-making during pandemics.
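To make the abstract's two evaluation ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes a common convention: each stochastic forward pass of a Bayesian model returns a predicted mean and a predicted variance per biomarker, model (epistemic) uncertainty is the spread of the means across passes, and data (aleatoric) uncertainty is the average predicted variance. The function names, the target order, and all numbers are hypothetical.

```python
import math
from statistics import fmean, pvariance

# Target order is an assumption made for this illustration.
BIOMARKERS = ("LDL-C", "HbA1c", "BMI", "SysBP")


def multi_target_metrics(y_true, y_pred):
    """Return MAE, MSE and RMSE pooled over every record and every target."""
    errors = [
        pred - true
        for true_row, pred_row in zip(y_true, y_pred)
        for true, pred in zip(true_row, pred_row)
    ]
    mae = fmean([abs(e) for e in errors])
    mse = fmean([e * e for e in errors])
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


def decompose_uncertainty(sample_means, sample_vars):
    """Split predictive variance for one record into model and data parts.

    sample_means[s][t] and sample_vars[s][t] are the mean and variance that
    stochastic forward pass s predicts for target t.
    """
    per_target_means = list(zip(*sample_means))  # regroup by target
    per_target_vars = list(zip(*sample_vars))
    mean = [fmean(m) for m in per_target_means]
    epistemic = [pvariance(m) for m in per_target_means]  # spread across passes
    aleatoric = [fmean(v) for v in per_target_vars]       # average predicted noise
    total = [e + a for e, a in zip(epistemic, aleatoric)]
    return {"mean": mean, "epistemic": epistemic,
            "aleatoric": aleatoric, "total": total}


if __name__ == "__main__":
    # Made-up normalized values for two records and four biomarkers.
    y_true = [[0.50, 0.20, 0.30, 0.40], [0.10, 0.60, 0.35, 0.45]]
    y_pred = [[0.52, 0.19, 0.30, 0.41], [0.09, 0.62, 0.33, 0.45]]
    print(multi_target_metrics(y_true, y_pred))

    # Three hypothetical stochastic passes for a single record.
    means = [[0.51, 0.20, 0.31, 0.40],
             [0.53, 0.18, 0.30, 0.42],
             [0.52, 0.19, 0.29, 0.41]]
    variances = [[0.0004] * 4, [0.0005] * 4, [0.0003] * 4]
    result = decompose_uncertainty(means, variances)
    for name, total in zip(BIOMARKERS, result["total"]):
        print(name, total)
```

When errors are pooled this way, RMSE is exactly the square root of MSE. Metrics that are instead averaged per target or per batch need not satisfy that identity.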

Paper Structure

This paper contains 22 sections, 8 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Heteroscedasticity in biomarkers: (a) HbA1c, (b) LDL-C.
  • Figure 2: Overview of Methodology
  • Figure 3: Visit frequency of patients
  • Figure 4: Our proposed MBT-CB framework based on a Transformer with Variational Self Attention. Patient's EHR biomarker values for $k$ visits (1 to $k$) are passed as input variables to the MBT-CB model. Prediction is on the $1^{st}$ visit, where $k < n$. DeepMTR image from reyes2019performing
  • Figure 5: Modification of a patient's EHR record. Each row represents a set of chronological biomarker values from an individual clinical visit, suitable for transformer-based modeling.
  • ...and 6 more figures