Diagnosis-based mortality prediction for intensive care unit patients via transfer learning

Mengqi Xu, Subha Maity, Joel Dubin

TL;DR

This study tackles the challenge of predicting ICU mortality across diagnostically heterogeneous patient groups. It introduces a two-step transfer learning framework for both GLM and XGBoost that leverages data-rich diagnoses to improve diagnosis-specific models, with per-diagnosis recalibration. Results show transfer learning consistently improves calibration and often discrimination over diagnosis-specific models and APACHE IVa baselines, while Youden-index thresholds provide better decision performance in low-prevalence settings. The approach demonstrates robust, threshold-agnostic gains across diagnoses and offers practical potential for improved ICU risk stratification and decision support, with simulated evidence supporting when gains are most likely to occur.
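One way to picture the recalibration step is logistic recalibration: keep the source model's log-odds and fit only a diagnosis-specific intercept and slope on the target data. The sketch below is an illustration under that assumption, not the authors' procedure, which this summary does not spell out. The function `recalibrate`, the variable `source_logit`, and the synthetic data are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def recalibrate(source_logit, y, n_iter=25):
    """Fit logit(p) = a + b * source_logit on target data via Newton-Raphson.

    Hypothetical helper: stands in for a per-diagnosis recalibration step.
    """
    source_logit = np.asarray(source_logit, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(source_logit), source_logit])
    beta = np.array([0.0, 1.0])  # start from "use the source model unchanged"
    for _ in range(n_iter):
        p = sigmoid(X @ beta)
        grad = X.T @ (y - p)                          # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Synthetic stand-in for one diagnosis group (assumed data, not eICU):
# the source model is overconfident, so the true slope is below 1.
rng = np.random.default_rng(0)
source_logit = rng.normal(0.0, 1.5, size=5000)
y = rng.binomial(1, sigmoid(-1.0 + 0.5 * source_logit))

intercept, slope = recalibrate(source_logit, y)
target_prob = sigmoid(intercept + slope * source_logit)
```

A fitted slope below 1 shrinks the source model's predictions toward the group's base rate, and the intercept shifts them to match the diagnosis-specific mortality rate.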

Abstract

In the intensive care unit, the underlying causes of critical illness vary substantially across diagnoses, yet prediction models accounting for diagnostic heterogeneity have not been systematically studied. To address the gap, we evaluate transfer learning approaches for diagnosis-specific mortality prediction and apply both GLM- and XGBoost-based models to the eICU Collaborative Research Database. Our results demonstrate that transfer learning consistently outperforms models trained only on diagnosis-specific data and those using a well-known ICU severity-of-illness score, i.e., APACHE IVa, alone, while also achieving better calibration than models trained on the pooled data. Our findings also suggest that the Youden cutoff is a more appropriate decision threshold than the conventional 0.5 for binary outcomes, and that transfer learning maintains consistently high predictive performance across various cutoff criteria.
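The Youden cutoff mentioned above is the threshold that maximizes J = sensitivity + specificity - 1. The following minimal sketch shows why it can differ sharply from 0.5 when mortality is rare. The function name `youden_cutoff` and the toy data are assumptions for illustration and do not come from the paper.

```python
import numpy as np

def youden_cutoff(y_true, y_prob):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when its probability is >= threshold.
    """
    y_true = np.asarray(y_true, dtype=int)
    y_prob = np.asarray(y_prob, dtype=float)
    n_pos = y_true.sum()
    n_neg = y_true.size - n_pos
    best_t, best_j = 0.5, -1.0
    for t in np.unique(y_prob):  # candidate thresholds, ascending
        pred = y_prob >= t
        sens = np.sum(pred & (y_true == 1)) / n_pos
        spec = np.sum(~pred & (y_true == 0)) / n_neg
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Toy low-prevalence example (assumed values): 2 deaths among 10 patients,
# with every predicted risk well below 0.5.
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
p = [0.02, 0.05, 0.03, 0.08, 0.10, 0.04, 0.06, 0.07, 0.30, 0.25]

cutoff, j_stat = youden_cutoff(y, p)
flagged_at_half = int(np.sum(np.asarray(p) >= 0.5))
```

In this toy case the Youden cutoff is 0.25 and separates the two groups perfectly, whereas a 0.5 threshold flags no patient at all.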

Paper Structure

This paper contains 22 sections, 14 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Flowchart of cohort selection from eICU-CRD and data preparation process.
  • Figure 2: GLM discrimination and calibration performance for source and transfer learning models. Metric values are obtained from all test folds. X-axis represents diagnoses.
  • Figure 3: XGBoost discrimination and calibration performance for source and transfer learning models. Metric values are obtained from all test folds. X-axis represents diagnoses.
  • Figure 4: Youden cutoff (boxes) vs. prevalence (red horizontal lines) as decision thresholds. The Youden cutoff is calculated from different training folds across diagnoses and methods. The prevalence is the mortality rate per diagnosis. X-axis represents different diagnoses, and y-axis represents decision threshold values.
  • Figure 5: Simulation results of AUROC and AUPRC, comparing true models, source models and transfer learning models for XGBoost and GLM under four scenarios. $b$ is signal strength, representing discrimination level. X-axis represents different methods.