Towards Personalised Patient Risk Prediction Using Temporal Hospital Data Trajectories

Thea Barnes, Enrico Werner, Jeffrey N. Clark, Raul Santos-Rodriguez

TL;DR

The paper addresses two limitations of traditional Early Warning Scores, their lack of personalisation and their reliance on static observations, by building personalised risk predictions from temporal trajectories of vital signs in the ICU. It proposes a pipeline that computes pairwise distances between patients' vital-sign trajectories using Dynamic Time Warping, reduces dimensionality with UMAP, and clusters patients with HDBSCAN*, followed by cluster-specific in-hospital mortality prediction using Explainable Boosting Machines. On MIMIC-IV data from 6,000 ICU stays, six distinct patient subtypes emerge, and data from only the first four hours assigns most patients to the same subtype as the full stay. Cluster-specific models achieve higher F1 scores than a non-clustered baseline in five of six clusters, highlighting the potential for real-time, personalised deterioration risk assessment; the authors note limitations and the need for external validation.

Abstract

Quantifying a patient's health status provides clinicians with insight into patient risk, and the ability to better triage and manage resources. Early Warning Scores (EWS) are widely deployed to measure overall health status, and risk of adverse outcomes, in hospital patients. However, current EWS are limited both by their lack of personalisation and use of static observations. We propose a pipeline that groups intensive care unit patients by the trajectories of observations data throughout their stay as a basis for the development of personalised risk predictions. Feature importance is considered to provide model explainability. Using the MIMIC-IV dataset, six clusters were identified, capturing differences in disease codes, observations, lengths of admissions and outcomes. Applying the pipeline to data from just the first four hours of each ICU stay assigns the majority of patients to the same cluster as when the entire stay duration is considered. In-hospital mortality prediction models trained on individual clusters had higher F1 score performance in five of the six clusters when compared against the unclustered patient cohort. The pipeline could form the basis of a clinical decision support tool, working to improve the clinical characterisation of risk groups and the early detection of patient deterioration.
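
The two evaluation quantities mentioned in the abstract (the share of patients assigned to the same cluster from early data as from the full stay, and per-cluster F1 for in-hospital mortality) can be illustrated with a minimal sketch. The function names and the toy labels below are hypothetical and are not taken from the paper. The sketch assumes that cluster labels from the two clustering runs have already been matched to each other, and it only shows how predictions might be scored per cluster; it does not train the cluster-specific models.

```python
def cluster_agreement(full_stay_labels, early_labels):
    """Fraction of patients whose early-window cluster equals their full-stay cluster.

    Assumes both label lists use the same (already aligned) cluster ids.
    """
    if len(full_stay_labels) != len(early_labels) or not full_stay_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(f == e for f, e in zip(full_stay_labels, early_labels))
    return matches / len(full_stay_labels)


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class (here: in-hospital death), 0.0 if undefined."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def per_cluster_f1(clusters, y_true, y_pred):
    """Group outcomes and predictions by cluster id and score each group."""
    grouped = {}
    for c, t, p in zip(clusters, y_true, y_pred):
        truths, preds = grouped.setdefault(c, ([], []))
        truths.append(t)
        preds.append(p)
    return {c: f1_score(t, p) for c, (t, p) in grouped.items()}


# Toy data (hypothetical): four patients, three clusters.
full_stay = [0, 1, 1, 2]
first_four_hours = [0, 1, 2, 2]
agreement = cluster_agreement(full_stay, first_four_hours)

# Toy mortality outcomes and predictions for six patients in two clusters.
scores = per_cluster_f1(
    clusters=[0, 0, 0, 1, 1, 1],
    y_true=[1, 0, 1, 1, 0, 0],
    y_pred=[1, 1, 1, 0, 0, 0],
)
```

In this toy example three of the four patients keep their cluster, so `agreement` is 0.75. Comparing each entry of `scores` against an F1 computed on the pooled cohort mirrors the clustered-versus-unclustered comparison described above.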

Paper Structure (9 sections, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Procedure for clustering patients based on similarities of their vital sign trajectories. For each feature, the patients' time series are compared pairwise using dynamic time warping (DTW). $\mathrm{DTW}_k(i,j)$ represents the DTW distance between the trajectories of feature $k$ for patients $i$ and $j$. The DTW matrices for all features are summed, resulting in a matrix of the total DTW distance across all features between all pairs of patients. Dimensionality reduction (UMAP) and clustering (HDBSCAN) are then computed, grouping patients by the similarity of their feature trajectories.
  • Figure 2: Aggregated feature impact for in-hospital mortality prediction models when considering each patient's entire intensive care unit stay. Top and second ICD codes are the two most frequently occurring ICD codes throughout the entire ICU stay. bp = Blood pressure, SATS = Oxygen saturation, GCS = Glasgow Coma Scale.
  • Figure 3: Cluster evolution when considering data from varying lengths of time from first admission to ICU. Cluster labels in panel (a) correspond to cluster labels from the whole ICU stay (Figure 1).
  • Figure 4: Weighted average performance of in-hospital mortality models for varying time frames since admission. 'end of stay' includes the entire patient stay and utilises additional features not available for the shorter time frames. Error bars are standard deviations.
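
The distance step described in the Figure 1 caption, a per-feature DTW matrix summed over features so that $D(i,j) = \sum_k \mathrm{DTW}_k(i,j)$, can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the absolute-difference local cost, the absence of a warping window or any normalisation, the function names, and the toy vital-sign values are all hypothetical. The later UMAP and HDBSCAN steps are omitted here.

```python
import math


def dtw_distance(a, b):
    """Classic unconstrained DTW between two 1-D series.

    Assumption: local cost is the absolute difference |a_i - b_j|.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # prev holds the previous row of the cumulative-cost table.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def total_dtw_matrix(patients, features):
    """Sum per-feature DTW distances into one symmetric patient-by-patient matrix."""
    n = len(patients)
    total = [[0.0] * n for _ in range(n)]
    for k in features:
        for i in range(n):
            for j in range(i + 1, n):
                d = dtw_distance(patients[i][k], patients[j][k])
                total[i][j] += d
                total[j][i] += d
    return total


# Toy trajectories (hypothetical values): heart rate and systolic blood pressure.
patients = [
    {"hr": [80, 82, 85], "sbp": [120, 118]},
    {"hr": [80, 82, 82, 85], "sbp": [120, 118]},
    {"hr": [100, 105, 110], "sbp": [90, 85]},
]
D = total_dtw_matrix(patients, features=["hr", "sbp"])
```

Because DTW allows a sample to be repeated, the first two toy patients have a total distance of zero even though their heart-rate series differ in length, while the third patient is far from both. A matrix such as `D` is the kind of input that could then be passed to a dimensionality reduction and clustering step that accepts precomputed distances.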