Feature importance to explain multimodal prediction models. A clinical use case

Jorn-Jan van de Beld, Shreyasi Pathak, Jeroen Geerdink, Johannes H. Hegeman, Christin Seifert

TL;DR

The paper addresses predicting 30-day mortality after hip fracture surgery by leveraging a multimodal deep-learning framework that integrates pre-operative static data, hip and chest images, and per-operative vital signs and medications. It introduces SHAP-based explanations and a novel Shapley value propagation method to provide local and global attributions across a sequence of unimodal models fused into a multimodal predictor. The main finding is that pre-operative data, especially static features, carry most predictive power, while per-operative data yield limited gains; nonetheless, SHAP-based explanations enable interpretable, modality- and feature-level insights for clinical decision-making. The work demonstrates the feasibility of explainable multimodal predictions in a clinical setting and highlights future directions for more robust fusion strategies and handling missing modalities to enhance generalizability and trustworthiness.
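As a rough illustration of modality-level attribution, the sketch below sums feature-level Shapley values within each modality to get a local (per-patient) contribution, then averages the absolute local contributions to get a global share. The feature-to-modality layout, the variable names, and the aggregation rule are assumptions made for this example, and the Shapley values are random placeholders rather than results from the paper.

```python
import numpy as np

# Hypothetical layout: which modality each input feature belongs to.
modality_of_feature = np.array(
    ["static", "static", "static", "hip_image", "chest_image", "vitals", "medication"]
)

# Placeholder Shapley values (patients x features); synthetic, not from the paper.
rng = np.random.default_rng(0)
phi = rng.normal(size=(5, modality_of_feature.size))

# Unique modalities in first-seen order.
modalities = list(dict.fromkeys(modality_of_feature.tolist()))

# Local contribution: sum the Shapley values of a modality's features per patient.
local = np.stack(
    [phi[:, modality_of_feature == m].sum(axis=1) for m in modalities], axis=1
)

# Global contribution (assumed rule): mean absolute local contribution,
# normalised so that the shares across modalities sum to one.
global_importance = np.abs(local).mean(axis=0)
global_share = global_importance / global_importance.sum()

for name, share in zip(modalities, global_share):
    print(f"{name}: {share:.2f}")
```

Because Shapley values are additive, summing within a modality keeps the per-patient total unchanged, so the local modality contributions still add up to the difference between the prediction and the baseline output.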

Abstract

Surgery to treat elderly hip fracture patients may cause complications that can lead to early mortality. An early warning system for complications could prompt clinicians to monitor high-risk patients more carefully and address potential complications early, or to inform the patient. In this work, we develop a multimodal deep-learning model for post-operative mortality prediction using pre-operative and per-operative data from elderly hip fracture patients. Specifically, the pre-operative data comprise static patient data and hip and chest images taken before surgery, and the per-operative data comprise vital signals and medications administered during surgery. We extract features from the image modalities using ResNet and from the vital signals using an LSTM. Explainable model outcomes are essential for clinical applicability; we therefore compute Shapley values to explain the predictions of our multimodal black box model. We find that i) Shapley values can be used to estimate the relative contribution of each modality both locally and globally, and ii) a modified version of the chain rule can be used to propagate Shapley values through a sequence of models, supporting interpretable local explanations. Our findings imply that a multimodal combination of black box models can be explained by propagating Shapley values through the model sequence.
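The idea of propagating Shapley values through a sequence of models can be sketched on a toy two-stage pipeline. The abstract does not give the exact form of the modified chain rule, so the example assumes one common rescaling rule: the attribution of input i to the output is the sum over intermediate features j of (attribution of i to feature j) times (attribution of feature j to the output), divided by the change in feature j relative to its baseline. The functions `extractor` and `head`, the weights, and the single-baseline Shapley computation are all hypothetical stand-ins, not the authors' models.

```python
import itertools
import math

import numpy as np


def shapley_values(f, x, baseline):
    """Exact Shapley values of scalar function f at x against a single baseline."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for size in range(n):
            weight = (
                math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            )
            for subset in itertools.combinations(others, size):
                idx = np.array(subset, dtype=int)
                z = baseline.copy()
                z[idx] = x[idx]          # features in the coalition take their real value
                without_i = f(z)
                z[i] = x[i]              # add feature i to the coalition
                phi[i] += weight * (f(z) - without_i)
    return phi


# Hypothetical unimodal feature extractor: 3 raw inputs -> 2 learned features.
W = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])


def extractor(x):
    return np.tanh(W @ x)


# Hypothetical fusion head: 2 learned features -> 1 risk score.
def head(h):
    return float(1.0 / (1.0 + np.exp(-(2.0 * h[0] - h[1] + 0.5 * h[0] * h[1]))))


def propagate(x, x_base):
    """Assumed chain rule: rescale head attributions back onto the raw inputs."""
    h, h_base = extractor(x), extractor(x_base)
    phi_head = shapley_values(head, h, h_base)   # learned features -> output
    phi_inputs = np.zeros(len(x))
    for j in range(len(h)):
        delta = h[j] - h_base[j]
        if abs(delta) < 1e-12:
            continue                             # unchanged feature carries no attribution
        phi_j = shapley_values(lambda v, j=j: extractor(v)[j], x, x_base)
        phi_inputs += phi_head[j] * phi_j / delta
    return phi_inputs


x = np.array([0.5, -1.0, 2.0])
x_base = np.zeros(3)
phi_inputs = propagate(x, x_base)
print(phi_inputs, phi_inputs.sum())
```

Under this rule the propagated attributions still sum to the difference between the pipeline output at `x` and at the baseline, which is the property that makes the raw-input explanation consistent with the modality-level one. Exact enumeration is only feasible for a handful of features; a real pipeline would need an approximate Shapley estimator.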

Paper Structure (22 sections, 5 equations, 2 figures, 7 tables)

Figures (2)

  • Figure 1: Overview of how the unimodal models are fused to form the multimodal models. Dimensions at the input and feature extraction layers are shown in brackets, where the sequence length (seq_len) for the vitals varies between patients. No feature transformation was performed for the per-operative medication data. Data types: tabular/structured (S), image (I), time-series (T).
  • Figure 2: Shapley plots for single cases in the test set. First, the contribution of each modality is shown, followed by the contribution of individual static features.