A Large-scale Multimodal Study for Predicting Mortality Risk Using Minimal and Low Parameter Models and Separable Risk Assessment

Alvaro E. Ulloa Cerna, Marios Pattichis, David P. vanMaanen, Linyuan Jing, Aalpen A. Patel, Joshua V. Stough, Christopher M. Haggerty, Brandon K. Fornwalt

TL;DR

This work addresses short-term mortality prediction using massive, heterogeneous clinical data by introducing a separable risk framework and a family of low-parameter multimodal architectures that fuse EHR, echocardiography videos, and ECG traces without heavy transfer learning. The authors demonstrate that a full EHR+Echo+ECG model achieves about 0.89–0.90 AUC, outperforming single-modality baselines and offering clear per-feature risk contributions through an interpretable fusion mechanism. They also present two minimal, home-monitorable variants with AUCs around 0.78–0.80 and show that non-linear, cubic feature transformations are essential for capturing risk patterns. The study provides a practical path toward interpretable, scalable mortality risk assessment and releases the DISIML package to facilitate replication and adoption in clinical research.

Abstract

Most biomedical studies rely on limited datasets, and their findings may not generalize to large, heterogeneous datasets collected over several decades. This paper develops and validates several multimodal models that predict 1-year mortality from a massive clinical dataset. Focusing on 1-year mortality can convey a sense of urgency to patients. Using the largest dataset of its kind, the paper develops and validates multimodal models based on 25,137,015 videos associated with 699,822 echocardiography studies from 316,125 patients, and 2,922,990 8-lead electrocardiogram (ECG) traces from 631,353 patients. Our models allow us to assess the contribution of individual factors and modalities to the overall risk. The approach also yields extremely low-parameter models that use optimized feature selection based on feature importance. Based on the available clinical information, we construct a family of models, which are made available in the DISIML package. Overall, performance ranges from an AUC of 0.72 with just ten parameters to an AUC of 0.89 with under 105k parameters for the full multimodal model. The proposed approach is a modular neural network framework that can provide insights into global risk trends and guide therapies for reducing mortality risk.

Paper Structure

This paper contains 15 sections, 11 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Multimodal system diagram. The system uses echocardiography ('Echo') videos from 6 views (top-left), ECG traces from multiple channels, Echo and ECG measurements and findings, and basic clinical data (age, sex, vitals, and labs). The Echo and ECG measurements and findings are supplied by the physician. 3D CNN models are used for processing the videos. A multichannel 1D CNN model is used for training on the ECG traces. The Separable Neural Network system (SNN) is used to process the contributions from the different inputs. The echocardiography videos yield a total of six features (red lines), the ECG traces yield one feature (blue line), and the structured data contribute 168 features (black line).
  • Figure 2: Separable risk functions for the 10 most significant clinical, non-binary features. The risk functions are shown in blue. The normalized histograms of the survivors are shown in light blue. The normalized histograms of the non-survivors appear in orange. Where the two histograms overlap, they appear light brown. Risk functions for: (a) age in years, (b) lymphocytes in percent, (c) lactate dehydrogenase in units per liter, (d) bilirubin in milligrams per deciliter, (e) hemoglobin in grams per deciliter, (f) blood urea nitrogen in milligrams per deciliter, (g) tricuspid regurgitation maximum velocity in centimeters per second, (h) left ventricular ejection fraction in percent, (i) BMI in kilograms per meter squared, and (j) systolic blood pressure in millimeters of mercury. The weight of each feature is given in the title of its graph. If the coefficient is negative, the risk function is adjusted so that higher values on the graph correspond to higher risk.
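
The separable structure described in the two captions can be illustrated with a minimal sketch. This summary does not give the exact functional form, so the code below rests on assumptions: each feature passes through its own cubic polynomial (suggested by the TL;DR's mention of cubic transformations), the per-feature outputs are weighted and summed, and a sigmoid turns the sum into a probability. The function names, coefficient layout, and random values are hypothetical and are not taken from the DISIML package. Only the feature counts (6 Echo, 1 ECG, 168 structured) come from Figure 1.

```python
import numpy as np

# Feature counts reported in Figure 1; everything else here is illustrative.
N_ECHO, N_ECG, N_STRUCTURED = 6, 1, 168
N_FEATURES = N_ECHO + N_ECG + N_STRUCTURED  # 175 inputs to the separable stage


def cubic_risk(x, coeffs):
    """Apply an independent cubic polynomial to each feature (assumed form).

    x:      array of shape (n_samples, n_features)
    coeffs: array of shape (3, n_features) holding the linear, quadratic,
            and cubic coefficients of each feature's risk function
    """
    return coeffs[0] * x + coeffs[1] * x**2 + coeffs[2] * x**3


def separable_risk(x, coeffs, weights, bias=0.0):
    """Combine per-feature risk functions into one mortality probability.

    Returns the probability per sample and the per-feature contributions,
    which sum (plus bias) to the logit. That additivity is what lets each
    feature's effect be inspected on its own.
    """
    contributions = weights * cubic_risk(x, coeffs)  # (n_samples, n_features)
    logit = contributions.sum(axis=1) + bias
    prob = 1.0 / (1.0 + np.exp(-logit))
    return prob, contributions


# Hypothetical inputs and parameters, for demonstration only.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, N_FEATURES))
coeffs = rng.normal(scale=0.1, size=(3, N_FEATURES))
weights = rng.normal(scale=0.1, size=N_FEATURES)

prob, contributions = separable_risk(x, coeffs, weights)
```

Because the logit is a plain sum of per-feature terms, plotting one column of `contributions` against the matching column of `x` gives a one-dimensional risk curve in the spirit of Figure 2.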