Zero Shot Health Trajectory Prediction Using Transformer

Pawel Renc; Yugang Jia; Anthony E. Samir; Jaroslaw Was; Quanzheng Li; David W. Bates; Arkadiusz Sitek

TL;DR

ETHOS presents a Transformer-based foundation model that operates on tokenized Patient Health Timelines (PHTs) to predict future health trajectories in a zero-shot setting, removing the need for task-specific labeled data or fine-tuning. Trained on the large MIMIC-IV EMR dataset with a 2048-token PHT context and a GPT-2–style decoder, ETHOS demonstrates robust zero-shot performance across mortality, LOS, readmission, SOFA estimation, and DRG classification, while offering the ability to generate multiple future timelines to quantify uncertainty. The work highlights a scalable pathway for healthcare AI by leveraging comprehensive tokenization and a single, adaptable model architecture, potentially reducing development costs and enabling rapid deployment across diverse tasks and data sources. Limitations include reliance on a single dataset, potential generalizability challenges, and substantial compute requirements, with future directions focusing on expanding data modalities, universal tokenization, and improved explainability and decision-support interfaces.
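The zero-shot idea described above can be illustrated with a small Monte Carlo sketch: sample several future timelines from a generative model, then report the fraction in which an outcome token appears before a terminal token. The sampler below is a toy stand-in (a fixed categorical distribution), not ETHOS. The token names `DEATH`, `DISCHARGE`, and `LAB`, the function names, and the probabilities are all assumptions made for illustration.

```python
import random

# Hypothetical token names; the actual ETHOS vocabulary is not given in this summary.
DEATH, DISCHARGE, LAB = "DEATH", "DISCHARGE", "LAB"


def toy_next_token(history, rng):
    """Stand-in for a trained model: draws the next token from a fixed distribution."""
    return rng.choices([LAB, DISCHARGE, DEATH], weights=[0.90, 0.08, 0.02])[0]


def simulate_timeline(history, rng, max_new_tokens=200):
    """Generate tokens until a terminal token appears or the budget runs out."""
    generated = []
    for _ in range(max_new_tokens):
        token = toy_next_token(history + generated, rng)
        generated.append(token)
        if token in (DEATH, DISCHARGE):
            break
    return generated


def estimate_outcome_probability(history, outcome=DEATH, n_samples=500, seed=0):
    """Fraction of simulated futures that contain the outcome token."""
    rng = random.Random(seed)
    hits = sum(outcome in simulate_timeline(history, rng) for _ in range(n_samples))
    return hits / n_samples


if __name__ == "__main__":
    p = estimate_outcome_probability(["ADMISSION"])
    print(f"Estimated probability of {DEATH} before {DISCHARGE}: {p:.3f}")
```

Because the estimate is a sample fraction, its spread across repeated runs (or a binomial interval around it) gives a simple measure of uncertainty; a real model would replace `toy_next_token` with sampling from its predicted next-token distribution.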

Abstract

Integrating modern machine learning and clinical decision-making has great promise for mitigating healthcare's increasing cost and complexity. We introduce the Enhanced Transformer for Health Outcome Simulation (ETHOS), a novel application of the transformer deep-learning architecture for analyzing high-dimensional, heterogeneous, and episodic health data. ETHOS is trained using Patient Health Timelines (PHTs), which are detailed, tokenized records of health events, to predict future health trajectories, leveraging a zero-shot learning approach. ETHOS represents a significant advancement in foundation model development for healthcare analytics, eliminating the need for labeled data and model fine-tuning. Its ability to simulate various treatment pathways and consider patient-specific factors positions ETHOS as a tool for care optimization and addressing biases in healthcare delivery. Future developments will expand ETHOS' capabilities to incorporate a wider range of data types and data sources. Our work demonstrates a pathway toward accelerated AI development and deployment in healthcare.

Paper Structure (16 sections, 12 figures, 2 tables)

This paper contains 16 sections, 12 figures, 2 tables.

Introduction
Results
Tokenization of MIMIC data and training of ETHOS
ETHOS inferences
Discussion
Methods
Data
Patient health timelines (PHTs), tokenization
ETHOS training
Evaluation of Clinical Outcomes and Tasks Using ETHOS
Statistical Analysis
Comparison of ETHOS to existing methods
Data Availability
Code Availability
Author Contribution
...and 1 more sections

Figures (12)

Figure 1: Implementing the ETHOS Model with EMR Data. (a) Extraction of raw patient data from the MIMIC-IV database, encompassing tables of admissions, patient demographics, medical procedures, among others. (b) The tokenization process, utilizing data from 90% of patients for model training and the remaining 10% for testing, transforms complex medical records into structured PHT for efficient model processing. (c) Training phase illustration, employing a transformer architecture optimized across 8 GPUs over a span of 36 hours. (d) Demonstration of ETHOS's zero-shot inference capabilities, highlighting its proficiency in performing tasks such as predicting inpatient mortality and readmission rates, leveraging forecasted future PHTs.
Figure 2: Tokenization and Embedding Visualizations of MIMIC-IV Data. (a) Overview of key insights derived from the tokenization process applied to MIMIC-IV data. (b) Visualization of embedding vectors for quantile tokens (Qs), which categorize quantitative information across the dataset. Each quantitative measure (e.g., blood pressure) is encoded by a preceding category-specific token followed by a quantile token, delineating its position within a predefined value range. This method facilitates a structured, scalable representation of complex data types via a systematic token sequence. (c) Visualization of embedding vectors for time-interval tokens, illustrating the temporal distribution and relationships within the PHT.
Figure 3: Receiver Operating Characteristic (ROC) Curves for Predictive Tasks via the ETHOS Model. Each graph shows the model's efficacy in forecasting a distinct clinical outcome, specifically mortality and readmission. Each ROC curve is accompanied by the case count (N), the outcome prevalence, and the 95% confidence interval for the AUC. Points marked with an 'X' denote the specific thresholds used for classification decisions within the ETHOS model. The area under the precision-recall (PR) curve is also provided, and the PR curves are presented in the supplementary material. The AUC of the existing study represents the performance of the best algorithms identified in the literature, with references provided within the text.
Figure 4: ETHOS Model Performance on SOFA Estimation and DRG Classification. (a) Estimation of the first-day Sequential Organ Failure Assessment (SOFA) score at ICU admission. ETHOS generates a sequence of three tokens: the admission type (orange token), a SOFA token (indicating that the SOFA score estimate will follow), and a quantile token (q-token, indicated by a question mark) whose predicted probabilities over the SOFA score quantiles are detailed at the bottom of panel (a). The fixed position of the SOFA token ensures that it is consistently predicted immediately after ICU admission. The SOFA score is derived from the quantile probabilities generated by ETHOS and the average SOFA value for each of the ten quantiles (values of 1.0, 3.5 …). Since a SOFA value of 24 was not present in the dataset, we predict values 0-23. (b) Correlation plot between actual and predicted SOFA scores. (c) For Diagnostic Related Groups (DRG) classification, the model is trained to insert a DRG token after tokens typically used at discharge time, using the placeholder “DRG_UNKNOWN” when the DRG is unknown in the training set. Predicted probabilities are used to compute the top-1, 2, 3, and 5 DRG classifications. (d) Visualization of DRG classification accuracy, showcasing the model's predictive performance.
Figure 5: Stages of PHT Construction and Tokenization in ETHOS. The process begins with assembling a chronological list of events from MIMIC-IV tables. Each entry on the list is time-stamped with a 64-bit real value (only 6 significant digits are shown for clarity) indicating the patient's age at which the event occurred. Subsequently, list elements are transformed into tokens using the ETHOS tokenization scheme. Depending on the event's nature, one event can be translated into 1 to 7 tokens. All tokens derived from the same event share its timestamp. The final step represents time gaps between events by inserting time-interval tokens. If the time difference between events is less than 5 minutes (the minimum value represented by the token for the shortest time interval), no token is added. After the interval tokens are added, timestamps are stripped from the timeline.
...and 7 more figures
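
Two of the procedures described in the captions above lend themselves to a short sketch: the final stage of Figure 5 (inserting time-interval tokens, then stripping timestamps) and the SOFA derivation in Figure 4(a), read here as a probability-weighted average of per-quantile mean scores, which is one plausible interpretation of the caption. Only the 5-minute minimum comes from the source. The bin edges, the token names, the use of minutes as the time unit, and all function names are hypothetical. The ten actual per-quantile SOFA means are not listed in the caption, so the demo uses a made-up three-bin example.

```python
from bisect import bisect_right

# Hypothetical interval bins (lower bound in minutes, token name). Only the
# 5-minute minimum comes from the caption; the other edges are assumptions.
INTERVAL_BINS = [(5, "T_5m-15m"), (15, "T_15m-1h"), (60, "T_1h-1d"), (1440, "T_>=1d")]
_LOWER_BOUNDS = [lower for lower, _ in INTERVAL_BINS]


def interval_token(gap_minutes):
    """Return the interval token for a gap, or None if the gap is under 5 minutes."""
    idx = bisect_right(_LOWER_BOUNDS, gap_minutes) - 1
    return INTERVAL_BINS[idx][1] if idx >= 0 else None


def add_interval_tokens(events):
    """Insert interval tokens between events and drop the timestamps.

    events: list of (timestamp_minutes, token) pairs, sorted by time. Tokens
    from the same event share a timestamp, so no token is inserted between them.
    """
    out = []
    prev_time = None
    for time, token in events:
        if prev_time is not None:
            gap_token = interval_token(time - prev_time)
            if gap_token is not None:
                out.append(gap_token)
        out.append(token)
        prev_time = time
    return out


def expected_score(quantile_probs, quantile_means):
    """Probability-weighted average of per-quantile mean scores."""
    if len(quantile_probs) != len(quantile_means):
        raise ValueError("need one mean per quantile")
    total = sum(quantile_probs)  # renormalize in case the probabilities do not sum to 1
    return sum(p * m for p, m in zip(quantile_probs, quantile_means)) / total


if __name__ == "__main__":
    # Made-up timeline: (minutes since first event, token).
    events = [(0, "ADMISSION"), (0, "ED"), (3, "LAB"), (20, "LAB"), (100, "DISCHARGE")]
    print(add_interval_tokens(events))
    # Made-up three-quantile example, not the ten SOFA quantiles from the paper.
    print(expected_score([0.2, 0.5, 0.3], [1.0, 2.0, 4.0]))
```

Keeping the bins in a sorted list and using `bisect_right` makes the lookup independent of how many interval tokens the vocabulary defines, so the hypothetical edges can be swapped for real ones without changing the logic.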
