XAI for In-hospital Mortality Prediction via Multimodal ICU Data

Xingqiao Li, Jindong Gu, Zhiyong Wang, Yancheng Yuan, Bo Du, Fengxiang He

TL;DR

The paper tackles in-hospital mortality prediction in ICUs by integrating multimodal ICU data (discrete event sequences, clinical notes, and vital signs) through a transformer-based framework (X-MMP). It introduces LRPTrans, an explainable extension of Layer-Wise Relevance Propagation to transformers using Gradient × Input, enabling attribution across all modalities. Experiments on MIMIC-III-derived data show competitive predictive accuracy and robust, modality-balanced explanations, with ablation and perturbation studies confirming the value of multimodal fusion and the superiority of the proposed XAI approach. The approach yields interpretable insights for clinicians, identifying salient features like GCS, certain clinical terms, and SpO2 fluctuations, and demonstrates case-level explanations, suggesting practical utility and transferability to other clinical tasks.

Abstract

Predicting in-hospital mortality for intensive care unit (ICU) patients is key to final clinical outcomes. AI has shown superior accuracy but suffers from a lack of explainability. To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP) as a step toward an efficient, explainable AI solution for predicting in-hospital mortality from multimodal ICU data. Our framework employs multimodal learning, so it can receive heterogeneous clinical inputs and make decisions from them. Furthermore, we introduce an explanation method, Layer-Wise Propagation to Transformer, as a proper extension of the LRP method to Transformers; it produces explanations over multimodal inputs and reveals the salient features attributed to a prediction. Moreover, the contribution of each modality to clinical outcomes can be visualized, helping clinicians understand the reasoning behind decision-making. We construct a multimodal dataset based on MIMIC-III and the MIMIC-III Waveform Database Matched Subset. Comprehensive experiments on benchmark datasets demonstrate that our framework achieves reasonable interpretation with competitive prediction accuracy. In particular, our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.

Paper Structure (31 sections, 7 equations, 5 figures, 4 tables)


Figures (5)

  • Figure 1: An overview of multimodal ICU data. The discrete event sequences (top) are often recorded irregularly and contain both categorical and continuous features. The vital signs (middle) are multi-channel and high-density, often sampled every minute or every second. The clinical notes (bottom) are written by caregivers, and their format and content differ from patient to patient. Crosses replace the note text for privacy protection.
  • Figure 2: An overview of X-MMP. The models for vital signs and discrete event sequences consist of sequence embedding (input embedding and position embedding), a stack of transformer encoding blocks, and pooler layers. The clinical notes model consists of word embeddings from ClinicalBERT and a stack of transformer encoding blocks. The output representations of the modalities are concatenated and fed into a feedforward neural network (FNN), and a final softmax layer produces the prediction. The explanation is based on gradient backpropagation: as the gradient propagates backward, the attribution of each node is calculated from its gradient and its value (Gradient $\times$ Input), which yields the attribution of the multimodal inputs to the specified prediction. (a) Overview of X-MMP. (b) Structure of LRPTrans.
  • Figure 3: Evaluation of six explanation methods using input perturbations. The input features are removed sequentially in order of absolute attribution, from smallest to largest. (a) Discrete event sequences. (b) Vital signs. (c) Clinical notes.
  • Figure 4: The attribution of input features to the prediction of death in different modalities. A total of 485 in-hospital death cases are used for analysis. (a) Top 5 positive-attribution events in discrete event sequences. (b) Attribution of all "GCS total" scores. (c) Top 10 positive-attribution tokens. (d) Top 10 negative-attribution tokens. (e) Attribution of all vital signs.
  • Figure 5: Visualization of the most salient features in three different cases. Darker green indicates that a feature contributes positively to the prediction of death, and red indicates a negative contribution. (a) Case with the most salient feature in discrete event sequences. (b) Case with the most salient feature in clinical notes. (c) Case with the most salient feature in vital signs.
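
The captions of Figures 2 and 3 describe two mechanisms that a small example may make more concrete: Gradient $\times$ Input attribution over fused multimodal inputs, and a perturbation test that removes features from smallest to largest absolute attribution. The following is a minimal sketch of those two ideas only, not the paper's LRPTrans or X-MMP. It assumes a toy model with one tanh unit per modality standing in for each transformer encoder, and a logistic output standing in for the FNN and softmax. Every name (`ENCODERS`, `HEAD`, `predict`, `gradient_x_input`, `perturbation_curve`), every weight, and the choice of zero as the "removed" value are hypothetical.

```python
import math

# Hypothetical toy weights: one tanh unit per modality, then a logistic output.
# These numbers are illustrative only and are not taken from the paper.
ENCODERS = {
    "events": [0.8, -0.5, 0.3],
    "vitals": [0.6, 0.2],
    "notes": [-0.4, 0.9, 0.1],
}
HEAD = {"events": 1.2, "vitals": -0.7, "notes": 0.9}
BIAS = -0.1


def predict(inputs):
    """Encode each modality, combine the representations, return P(death)."""
    logit = BIAS
    for name, weights in ENCODERS.items():
        hidden = math.tanh(sum(w * x for w, x in zip(weights, inputs[name])))
        # A linear layer over the concatenated hidden units is a weighted sum.
        logit += HEAD[name] * hidden
    return 1.0 / (1.0 + math.exp(-logit))


def gradient_x_input(inputs):
    """Attribution of each input feature: d(prob)/d(x_i) * x_i, via the chain rule."""
    prob = predict(inputs)
    d_logit = prob * (1.0 - prob)  # derivative of the sigmoid
    attributions = {}
    for name, weights in ENCODERS.items():
        pre = sum(w * x for w, x in zip(weights, inputs[name]))
        d_hidden = 1.0 - math.tanh(pre) ** 2  # derivative of tanh
        attributions[name] = [
            d_logit * HEAD[name] * d_hidden * w * x
            for w, x in zip(weights, inputs[name])
        ]
    return attributions


def perturbation_curve(inputs, attributions, modality):
    """Zero out one modality's features from smallest to largest |attribution|."""
    order = sorted(
        range(len(inputs[modality])),
        key=lambda i: abs(attributions[modality][i]),
    )
    perturbed = {k: list(v) for k, v in inputs.items()}
    curve = [predict(perturbed)]  # prediction before any removal
    for i in order:
        perturbed[modality][i] = 0.0  # assumption: "removal" means setting to zero
        curve.append(predict(perturbed))
    return order, curve


# Hypothetical patient with three modalities of made-up feature values.
patient = {
    "events": [1.0, 0.5, -1.0],
    "vitals": [0.2, 1.5],
    "notes": [0.0, 1.0, 2.0],
}
attr = gradient_x_input(patient)
order, curve = perturbation_curve(patient, attr, "notes")
```

Printing `attr` shows one signed score per input feature in each modality, and `curve` shows how the predicted probability moves as the `notes` features are removed in `order`. With a faithful attribution method, removing the lowest-attribution features first should change the prediction only slightly at the start of the curve.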