MIMIC-Sepsis: A Curated Benchmark for Modeling and Learning from Sepsis Trajectories in the ICU

Yong Huang, Zhongqi Yang, Amir Rahmani

TL;DR

This work addresses the need for reproducible, up-to-date benchmarks in sepsis trajectory modeling by introducing MIMIC-Sepsis, a curated Sepsis-3 cohort derived from MIMIC-IV with time-aligned clinical variables and treatment data. It provides a transparent preprocessing pipeline, a clearly defined onset-aligned observation window, and four benchmark tasks for evaluating time-aware models: two static (mortality and length-of-stay prediction) and two dynamic (vasopressor and shock prediction). Empirical results show that Transformer-based architectures outperform baselines and that incorporating treatment information substantially improves dynamic predictions, highlighting the importance of intervention data in modeling sepsis trajectories. As a publicly available resource, MIMIC-Sepsis enables robust evaluation, reproducibility, and future work in reinforcement learning and multimodal modeling for sepsis management in critical care.

Abstract

Sepsis is a leading cause of mortality in intensive care units (ICUs), yet existing research often relies on outdated datasets, non-reproducible preprocessing pipelines, and limited coverage of clinical interventions. We introduce MIMIC-Sepsis, a curated cohort and benchmark framework derived from the MIMIC-IV database, designed to support reproducible modeling of sepsis trajectories. Our cohort includes 35,239 ICU patients with time-aligned clinical variables and standardized treatment data, including vasopressors, fluids, mechanical ventilation, and antibiotics. We describe a transparent preprocessing pipeline (based on Sepsis-3 criteria, structured imputation strategies, and treatment inclusion) and release it alongside benchmark tasks focused on early mortality prediction, length-of-stay estimation, and shock onset classification. Empirical results demonstrate that incorporating treatment variables substantially improves model performance, particularly for Transformer-based architectures. MIMIC-Sepsis serves as a robust platform for evaluating predictive and sequential models in critical care research.

Paper Structure

This paper contains 12 sections, 3 figures, and 4 tables.

Figures (3)

  • Figure 1: Cohort selection and data processing workflow. This figure illustrates the high-level approach used to extract and process sepsis-related data from MIMIC-IV.
  • Figure 2: Key Clinical Parameters Across Organ Systems. Each panel shows the distribution of a key parameter using boxplots and violin plots. The correlation heatmap (bottom right) displays relationships between parameters.
  • Figure 3: Association between treatment interventions and hospital mortality in sepsis patients. The upper panels compare mortality rates between patients who did and did not receive antibiotics (left) and vasopressors (right). The lower panels analyze the impact of early ($\leq 6$h) versus late ($>6$h) administration timing. Statistical significance was assessed using chi-square tests, with p-values displayed for each comparison.
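
The Figure 3 caption reports chi-square tests on mortality rates between two groups. As a rough illustration of that kind of comparison, here is a minimal sketch of a Pearson chi-square test on a 2x2 table. The function name `chi_square_2x2` and the counts are hypothetical and are not taken from the paper, and the sketch assumes no continuity correction; the caption does not say whether the authors applied one.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    # Closed form of sum((observed - expected)^2 / expected) for a 2x2 table.
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)), so no statistics library is needed.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts, NOT results from the paper.
# Rows: early vs. late administration; columns: died vs. survived.
stat, p = chi_square_2x2(30, 70, 50, 50)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

A small p-value here only indicates an association between group and outcome in the table; it does not by itself establish that the treatment or its timing caused the difference.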