DySurv: dynamic deep learning model for survival analysis with conditional variational inference

Munib Mesinovic, Peter Watkinson, Tingting Zhu

TL;DR

DySurv is a novel conditional variational autoencoder-based method that combines static and longitudinal measurements from electronic health records to estimate the individual risk of death dynamically. It outperforms existing statistical and deep learning approaches to time-to-event analysis across concordance and other metrics.

Abstract

Machine learning applications for longitudinal electronic health records often forecast the risk of events at fixed time points, whereas survival analysis achieves dynamic risk prediction by estimating time-to-event distributions. Here, we propose a novel conditional variational autoencoder-based method, DySurv, which uses a combination of static and longitudinal measurements from electronic health records to estimate the individual risk of death dynamically. DySurv directly estimates the cumulative risk incidence function without making any parametric assumptions on the underlying stochastic process of the time-to-event. We evaluate DySurv on 6 time-to-event benchmark datasets in healthcare, as well as 2 real-world intensive care unit (ICU) electronic health records (EHR) datasets extracted from the eICU Collaborative Research Database (eICU) and the Medical Information Mart for Intensive Care (MIMIC-IV) database. DySurv outperforms existing statistical and deep learning approaches to time-to-event analysis across concordance and other metrics. It achieves time-dependent concordance of over 60% in the eICU case. It is also over 12% more accurate and 22% more sensitive than in-use ICU scores such as the Acute Physiology and Chronic Health Evaluation (APACHE) and Sequential Organ Failure Assessment (SOFA) scores. The predictive capacity of DySurv is consistent, and the survival estimates remain disentangled across different datasets. Our interdisciplinary framework incorporates deep learning, survival analysis, and intensive care to create a novel method for time-to-event prediction from longitudinal health records. We test our method on several held-out test sets from a variety of healthcare datasets and compare it to existing in-use clinical risk scoring benchmarks.
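
To make the idea of estimating a cumulative incidence function without parametric assumptions concrete, the following is a minimal sketch in discrete time. It is an illustration only, not the DySurv implementation: it assumes a model that emits one logit per time bin plus one extra logit for "no event within the horizon", and the function names `cumulative_incidence` and `survival_from_cif` are hypothetical.

```python
import numpy as np

def cumulative_incidence(logits):
    """Map per-bin logits to a discrete-time cumulative incidence function.

    Assumption (not from the paper): `logits` has K + 1 entries, where the
    first K are time bins and the last is "event after the horizon".
    """
    logits = np.asarray(logits, dtype=float)
    # Numerically stabilised softmax: probability mass assigned to each bin.
    z = logits - logits.max(axis=-1, keepdims=True)
    pmf = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # CIF(t_k) = P(T <= t_k): running sum over the K in-horizon bins only.
    cif = np.cumsum(pmf[..., :-1], axis=-1)
    return pmf, cif

def survival_from_cif(cif):
    """With a single event type, S(t_k) = 1 - CIF(t_k)."""
    return 1.0 - np.asarray(cif, dtype=float)

# Uniform logits over 4 bins plus the "after horizon" bin (toy input).
pmf, cif = cumulative_incidence(np.zeros(5))
surv = survival_from_cif(cif)
```

Because the bin probabilities are free parameters of the softmax, no distributional shape (exponential, Weibull, and so on) is imposed on the event time; the only constraint is that the resulting curve is monotone.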

Paper Structure (18 sections, 13 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: (a) Description of the proposed DySurv framework using longitudinal EHR data for dynamic risk prediction instead of fixed-point event classification. The patient stay consists of measurements until $t_{j}$, the last measurement recorded in the observation window. The orange marker indicates how much of the longitudinal data, i.e., how many timestamps, is used in the model learning process. The red marker indicates how many timestamps into the future the model estimates the risk of death. In classical machine learning, only the most recent timestamp measurement is used to predict the risk of death at a fixed time point in the future, at a prediction window distance (except for Gaussian processes). In deep learning classification with LSTMs, we can learn from the entire longitudinal record of the patient but only estimate the risk of death at a fixed time point; hence the observation window is fully orange, but only one of the prediction window timestamps is red. DySurv uses all of the available longitudinal data to predict risk dynamically, i.e., at all reasonable times into the future.
  • Figure 2: Survival curves (estimate of survival probability over time) for benchmark datasets by DySurv across different samples. DySurv provides discrete estimates over time and additional interpolation was applied. Dots correspond to true event times which were predicted correctly by DySurv.
  • Figure 3: The set of survival curves for the MIMIC-IV ICU EHR dataset as generated by DySurv shows extrapolation of risk across different patients as compared to static and time-series feature sets. Dots correspond to true event times which were predicted correctly by DySurv.
  • Figure 4: Comparison of DySurv at 24-hour prediction with existing ICU survival scores and deep learning survival models on MIMIC-IV using (a) AUROC, (b) sensitivity, and (c) AUPRC. APACHE IV could only be retrieved from MIMIC-IV, and the SOFA score was calculated from the eICU dataset.
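
The captions above mention discrete survival estimates that are interpolated (Figure 2) and a comparison at a 24-hour prediction horizon (Figure 4). The sketch below shows one way such a curve could be interpolated and read off at a fixed horizon. It assumes linear interpolation, which the captions do not specify, and the helper `survival_at`, the time grid, and the survival values are all hypothetical.

```python
import numpy as np

def survival_at(t, grid, surv):
    """Linearly interpolate a discrete survival curve at time(s) t.

    Assumes `grid` is strictly increasing and starts after time 0.
    """
    # Prepend S(0) = 1 so times before the first grid point interpolate from 1.
    grid_full = np.concatenate(([0.0], np.asarray(grid, dtype=float)))
    surv_full = np.concatenate(([1.0], np.asarray(surv, dtype=float)))
    # np.interp holds the last value constant beyond the final grid point.
    return np.interp(t, grid_full, surv_full)

grid = [12.0, 24.0, 36.0, 48.0]   # hours; hypothetical time bins
surv = [0.95, 0.90, 0.80, 0.70]   # hypothetical survival estimates

# Risk of death by a horizon is one minus the survival probability there.
risk_24h = 1.0 - survival_at(24.0, grid, surv)
risk_30h = 1.0 - survival_at(30.0, grid, surv)
```

A scalar such as `risk_24h` is the kind of per-patient score that can be ranked against observed 24-hour outcomes to compute AUROC, sensitivity, or AUPRC alongside a fixed clinical score.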