Improving Representation Learning of Complex Critical Care Data with ICU-BERT

Ricardo Santos, André V. Carreiro, Xi Peng, Hugo Gamboa, Holger Fröhlich

TL;DR

ICU-BERT introduces a transformer-based representation learning framework for ICU data that treats multivariate, asynchronous records as quadruplet tokens and leverages BioBERT embeddings and temporal encodings. A novel pre-training objective, Masked Language-Value Modelling, combined with a multi-task loss, enables robust learning from diverse data streams with minimal preprocessing. In experiments across MIMIC-IV and YAIB-processed datasets, ICU-BERT achieves competitive performance on ICU mortality prediction and other clinically relevant tasks, and fine-tuning substantially improves external generalization. The experiments also highlight challenges in long-sequence modeling and benchmarking. The approach advances foundational medical AI by integrating structured and unstructured data through a scalable, generalizable representation suitable for broad clinical decision support, and it outlines avenues for multimodal extensions and longer-sequence architectures.

Abstract

The multivariate, asynchronous nature of real-world clinical data, such as that generated in Intensive Care Units (ICUs), challenges traditional AI-based decision-support systems. These systems often assume data regularity and feature independence, and they frequently rely on limited data scopes and manual feature engineering. The potential of generative AI technologies for analyzing clinical data has not yet been fully exploited. We introduce ICU-BERT, a transformer-based model pre-trained on the MIMIC-IV database using a multi-task scheme to learn robust representations of complex ICU data with minimal preprocessing. ICU-BERT employs a multi-token input strategy, incorporating dense embeddings from a biomedical Large Language Model to learn a generalizable representation of complex and multivariate ICU data. In an initial evaluation on five tasks and four additional ICU datasets, the results indicate that ICU-BERT, once fine-tuned, either matches or surpasses current performance benchmarks. By integrating structured and unstructured data, ICU-BERT advances the use of foundational models in medical informatics, offering an adaptable solution for clinical decision support across diverse applications.

Paper Structure

This paper contains 21 sections, 11 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: ICU-BERT scheme. Complex, multivariate, and sparse medical registries $r_i$ are processed by a multi-token embedding structure that combines embeddings from feature names $f$, categorical or numerical values $x$, timestamps $\tau$, and durations $\delta$. Pre-trained embeddings enhance the representations of features and categorical values, and a novel pre-training multi-task loss optimizes the simultaneous reconstruction of both features and values.
  • Figure 2: Results for ICU mortality as mean and standard deviation over 5-fold CV, on external evaluation and fine-tuning, in MIMIC-IV and the YAIB-processed datasets.
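
The multi-token embedding described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation. The names `Registry`, `text_embedding`, `sinusoidal`, `value_embedding`, and `embed` are hypothetical, as are the toy dimension `DIM` and the hour-based time units. The hash-based `text_embedding` is only a deterministic stand-in for pre-trained text embeddings such as BioBERT. Combining the four components by summation and using a sinusoidal encoding for $\tau$ and $\delta$ are assumptions made for illustration.

```python
import hashlib
import math
from dataclasses import dataclass
from typing import List, Union

DIM = 8  # toy embedding width (assumption; not the model's real dimension)


@dataclass
class Registry:
    """One record r_i = (f, x, tau, delta), following the Figure 1 notation."""
    feature: str              # feature name f
    value: Union[float, str]  # numerical or categorical value x
    time_h: float             # timestamp tau, in hours (assumed unit)
    duration_h: float         # duration delta, in hours (assumed unit)


def text_embedding(text: str) -> List[float]:
    """Deterministic stand-in for a pre-trained text embedding lookup."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:DIM]]


def sinusoidal(t: float) -> List[float]:
    """Sinusoidal encoding of a continuous scalar (assumed temporal encoding)."""
    out: List[float] = []
    for k in range(DIM // 2):
        freq = 1.0 / (10000 ** (2 * k / DIM))
        out.extend([math.sin(t * freq), math.cos(t * freq)])
    return out


def value_embedding(x: Union[float, str]) -> List[float]:
    """Categorical values reuse the text embedding; numbers scale a fixed vector."""
    if isinstance(x, str):
        return text_embedding(x)
    direction = text_embedding("<numeric>")
    return [x * d for d in direction]


def embed(r: Registry) -> List[float]:
    """Combine the four component embeddings by element-wise sum (assumption)."""
    parts = [
        text_embedding(r.feature),
        value_embedding(r.value),
        sinusoidal(r.time_h),
        sinusoidal(r.duration_h),
    ]
    return [sum(components) for components in zip(*parts)]


# Usage: a numerical and a categorical record map to vectors of equal width.
heart_rate = Registry("heart rate", 82.0, 1.5, 0.0)
rhythm = Registry("heart rhythm", "sinus rhythm", 1.5, 0.0)
token_vectors = [embed(heart_rate), embed(rhythm)]
```

Because every record becomes one fixed-width vector regardless of feature type, irregularly sampled and sparse records can be fed to a transformer as a plain token sequence without resampling onto a regular time grid.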