Explainable Machine Learning for Sepsis Outcome Prediction Using a Novel Romanian Electronic Health Record Dataset

Andrei-Alexandru Bunea; Ovidiu Ghibea; Dan-Matei Popovici; Ion Daniel; Octavian Andronic

Explainable Machine Learning for Sepsis Outcome Prediction Using a Novel Romanian Electronic Health Record Dataset

Andrei-Alexandru Bunea, Ovidiu Ghibea, Dan-Matei Popovici, Ion Daniel, Octavian Andronic

Abstract

We develop and analyze explainable machine learning (ML) models for sepsis outcome prediction using a novel Electronic Health Record (EHR) dataset from 12,286 hospitalizations at a large emergency hospital in Romania. The dataset includes demographics, International Classification of Diseases (ICD-10) diagnostics, and 600 types of laboratory tests. This study aims to identify clinically strong predictors while achieving state-of-the-art results across three classification tasks: (1)deceased vs. discharged, (2)deceased vs. recovered, and (3)recovered vs. ameliorated. We trained five ML models to capture complex distributions while preserving clinical interpretability. Experiments explored the trade-off between feature richness and patient coverage, using subsets of the 10--50 most frequent laboratory tests. Model performance was evaluated using accuracy and area under the curve (AUC), and explainability was assessed using SHapley Additive exPlanations (SHAP). The highest performance was obtained for the deceased vs. recovered case study (AUC=0.983, accuracy=0.93). SHAP analysis identified several strong predictors such as cardiovascular comorbidities, urea levels, aspartate aminotransferase, platelet count, and eosinophil percentage. Eosinopenia emerged as a top predictor, highlighting its value as an underutilized marker that is not included in current assessment standards, while the high performance suggests the applicability of these models in clinical settings.

Explainable Machine Learning for Sepsis Outcome Prediction Using a Novel Romanian Electronic Health Record Dataset

Abstract

Explainable Machine Learning for Sepsis Outcome Prediction Using a Novel Romanian Electronic Health Record Dataset

Abstract

Paper Structure

Table of Contents

Figures (11)