HgbNet: predicting hemoglobin level/anemia degree from EHR data
Zhuo Zhi, Moe Elbadawi, Adam Daneshmend, Mine Orlu, Abdul Basit, Andreas Demosthenous, Miguel Rodrigues
TL;DR
HgbNet tackles non-invasive hemoglobin (Hb) and anemia prediction from irregular multivariate EHR time series. It combines a NanDense layer for missing data, a time-embedding strategy for local irregularity, and three attention mechanisms that capture local and global irregularities. The architecture, which includes a specialized LSTM-M backbone and a downstream MLP, processes four inputs per visit and outputs Hb levels and anemia degrees; it is evaluated with RMSE, MAE, and R^2 for the Hb level and with weighted classification metrics for the anemia degree. Evaluations on MIMIC-III and eICU across two use cases show HgbNet consistently outperforming state-of-the-art baselines and remaining robust to irregular time gaps, with further gains when non-invasive measurements at the target time are incorporated. The work establishes the feasibility and potential clinical impact of EHR-based, non-invasive Hb/anemia prediction and points to future directions in attention analysis and sensor-enabled extensions.
Abstract
Anemia is a prevalent medical condition that typically requires invasive blood tests for diagnosis and monitoring. Electronic health records (EHRs) have emerged as valuable data sources for numerous medical studies. EHR-based prediction of hemoglobin level/anemia degree is non-invasive and rapid, but it remains challenging because EHR data typically form an irregular multivariate time series with many missing values and irregular time intervals. To address these issues, we introduce HgbNet, a machine learning-based prediction model that emulates clinicians' decision-making processes for hemoglobin level/anemia degree prediction. The model incorporates a NanDense layer with a missing indicator to handle missing values and employs attention mechanisms to account for both local and global irregularity. We evaluate the proposed method on two real-world datasets across two use cases. In the first use case, we predict hemoglobin level/anemia degree at moment T+1 from records taken before T+1. In the second use case, we combine all historical records with additional selected test results at moment T+1 to predict hemoglobin level/anemia degree at that same moment. HgbNet outperforms the best baseline results across all datasets and use cases. These findings demonstrate the feasibility of estimating hemoglobin level and anemia degree from EHR data, positioning HgbNet as an effective non-invasive anemia diagnosis solution that could potentially enhance the quality of life for millions of affected individuals worldwide. To our knowledge, HgbNet is the first machine learning model leveraging EHR data for hemoglobin level/anemia degree prediction.
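To make the data setting concrete, the following is a minimal sketch of how one patient's irregular record might be turned into zero-filled values, a missing indicator, and inter-visit time gaps, and then split according to the two use cases. It is not the NanDense layer or any other part of HgbNet; the function names (`build_inputs`, `split_use_cases`), the array layout, the zero-fill choice, and the hour-based timestamps are all assumptions made for illustration.

```python
import numpy as np

def build_inputs(values, times):
    """Hypothetical preprocessing for one patient.

    values: (T, D) array of test results, NaN where a test was not taken.
    times:  (T,) visit timestamps (assumed to be in hours, ascending).
    """
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    mask = (~np.isnan(values)).astype(float)   # missing indicator: 1 = observed, 0 = missing
    filled = np.nan_to_num(values, nan=0.0)    # zero-fill (an assumption, not the paper's rule)
    gaps = np.diff(times, prepend=times[0])    # time since previous visit; 0 for the first visit
    return filled, mask, gaps

def split_use_cases(values, times, hb_col, extra_cols):
    """Treat the last row as the target visit T+1 (illustrative convention)."""
    values = np.asarray(values, dtype=float)
    filled, mask, gaps = build_inputs(values, times)
    target_hb = values[-1, hb_col]             # label: Hb measured at T+1
    # Use case 1: only visits before T+1, plus the gap from T to T+1.
    case1 = {
        "values": filled[:-1],
        "mask": mask[:-1],
        "gaps": gaps[:-1],
        "gap_to_target": gaps[-1],
    }
    # Use case 2: same history, plus selected tests observed at T+1.
    case2 = dict(
        case1,
        extra_values=filled[-1, extra_cols],
        extra_mask=mask[-1, extra_cols],
    )
    return case1, case2, target_hb
```

For example, with three visits at hours 0, 6, and 30 and column 0 holding Hb, `split_use_cases(values, times, hb_col=0, extra_cols=[1, 2])` yields a two-visit history for both use cases, and the second use case additionally carries the values and missing indicator for columns 1 and 2 at the target visit.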
