Benchmarking with MIMIC-IV, an irregular, sparse clinical time series dataset

Hung Bui, Harikrishna Warrier, Yogesh Gupta

TL;DR

This work benchmarks irregular, sparse clinical time-series from MIMIC-IV on two tasks—in-ICU mortality and ICU length-of-stay—using a standardized data pipeline adapted from Gupta et al. It compares XGBoost, LSTM, and TCN under 5-fold cross-validation, finding XGBoost to be the strongest performer for both tasks. The study highlights the value of standardized benchmarking for MIMIC-IV and suggests expanding the suite with additional models and tasks to enhance clinical predictive research. Overall, it reinforces that robust baselines and consistent evaluation are crucial for progress in time-series EHR modeling.

Abstract

Electronic health records (EHRs) are increasingly common, and with them the application of machine learning to problems in the clinical domain. This growing research area also increases the need for accessible EHR data. The Medical Information Mart for Intensive Care (MIMIC) dataset is a popular, public, and free EHR dataset in raw format that has been used in numerous studies. Despite its popularity, however, it lacks benchmarking work, especially against recent state-of-the-art deep learning methods for tabular time-series data. This work aims to fill that gap by providing a benchmark for the latest version of the MIMIC dataset, MIMIC-IV. We also give a detailed literature survey of studies that have already been done on MIMIC-III.
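
The 5-fold cross-validation protocol mentioned in the TL;DR can be illustrated with a minimal, standard-library-only sketch. This is not the authors' pipeline. The helper names (`stratified_kfold`, `cross_validate`, `majority_baseline`), the synthetic labels, the stratified splitting, and the use of accuracy as the metric are all assumptions made for illustration. A real benchmark would plug XGBoost, LSTM, or TCN models into the `fit_predict` slot.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k stratified folds (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold keeps roughly the class ratio.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


def cross_validate(labels, fit_predict, k=5, seed=0):
    """Return per-fold accuracy for a fit_predict(train_idx, test_idx) callable."""
    scores = []
    for train_idx, test_idx in stratified_kfold(labels, k, seed):
        preds = fit_predict(train_idx, test_idx)
        correct = sum(p == labels[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return scores


# Synthetic labels (assumption): 1 = in-ICU death, 0 = survival, 10% positive.
labels = [1 if i % 10 == 0 else 0 for i in range(200)]


def majority_baseline(train_idx, test_idx):
    """Trivial stand-in model: always predict the training-set majority class."""
    positives = sum(labels[i] for i in train_idx)
    majority = int(positives * 2 > len(train_idx))
    return [majority] * len(test_idx)


scores = cross_validate(labels, majority_baseline)
print(scores)
```

Because every model is scored on the same five splits, differences between per-fold scores reflect the models and not the partitioning. The majority-class baseline also shows why accuracy alone can mislead on imbalanced outcomes such as mortality.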

Paper Structure (8 sections, 1 figure, 3 tables)

This paper contains 8 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Pipeline overview, from Gupta et al. (gupta2022extensive)