Diagnosis-based mortality prediction for intensive care unit patients via transfer learning
Mengqi Xu, Subha Maity, Joel Dubin
TL;DR
This study addresses ICU mortality prediction across diagnostically heterogeneous patient groups. It introduces a two-step transfer learning framework, implemented for both GLM and XGBoost, that leverages data-rich diagnoses to improve diagnosis-specific models and recalibrates predictions per diagnosis. Transfer learning consistently improves calibration, and often discrimination, over diagnosis-specific models and APACHE IVa baselines, while Youden-index thresholds give better decision performance than a fixed cutoff in low-prevalence settings. The gains hold across diagnoses and across threshold choices, which suggests practical potential for ICU risk stratification and decision support, and simulations indicate when those gains are most likely to occur.
Abstract
In the intensive care unit, the underlying causes of critical illness vary substantially across diagnoses, yet prediction models that account for this diagnostic heterogeneity have not been systematically studied. To address this gap, we evaluate transfer learning approaches for diagnosis-specific mortality prediction and apply both GLM- and XGBoost-based models to the eICU Collaborative Research Database. Our results demonstrate that transfer learning consistently outperforms both models trained only on diagnosis-specific data and models that use only APACHE IVa, a well-known ICU severity-of-illness score, while also achieving better calibration than models trained on the pooled data. Our findings also suggest that the Youden cutoff is a more appropriate decision threshold than the conventional 0.5 for binary outcomes, and that transfer learning maintains consistently high predictive performance across various cutoff criteria.
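To make the threshold comparison concrete, the following is a minimal sketch of how a Youden cutoff can be selected from predicted probabilities: it chooses the threshold that maximizes J = sensitivity + specificity - 1. The function names `youden_cutoff` and `sens_spec` and the toy data are illustrative assumptions, not taken from the study, and the sketch is not the authors' implementation.

```python
def sens_spec(y_true, y_prob, threshold):
    """Sensitivity and specificity when predicting death if prob >= threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    return tp / n_pos, tn / n_neg


def youden_cutoff(y_true, y_prob):
    """Return (threshold, J) maximizing Youden's J over the observed probabilities.

    Ties are broken in favor of the lowest threshold.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):
        sens, spec = sens_spec(y_true, y_prob, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical low-prevalence toy data (2 deaths among 10 patients).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.02, 0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.30, 0.25, 0.40]

cutoff, j_stat = youden_cutoff(y_true, y_prob)
sens_at_half, spec_at_half = sens_spec(y_true, y_prob, 0.5)

print("Youden cutoff:", cutoff, "J:", j_stat)
print("At 0.5 -> sensitivity:", sens_at_half, "specificity:", spec_at_half)
```

In this toy example no predicted probability reaches 0.5, so the conventional cutoff flags no patient and has zero sensitivity, whereas the Youden cutoff sits well below 0.5. This is the kind of low-prevalence behavior that motivates a data-driven threshold.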
