SHAP-CAT: A interpretable multi-modal framework enhancing WSI classification via virtual staining and shapley-value-based multimodal fusion

Jun Wang; Yu Mao; Nan Guan; Chun Jason Xue

TL;DR

The proposed SHAP-CAT framework, when incorporating synthetic modalities, significantly enhances model performance, yielding a 5% increase in accuracy on the BCI dataset, 8% on IHC4BC-ER, and 11% on IHC4BC-PR.

Abstract

Multimodal models have shown promise in histopathology. However, most are built on H&E and genomics and adopt increasingly complex, black-box designs. We propose SHAP-CAT, a novel interpretable multimodal framework that uses a Shapley-value-based dimension reduction technique for effective multimodal fusion. Starting from two paired modalities, H&E and IHC images, we employ virtual staining to augment the limited input data by generating a new clinically related modality. Lightweight bag-level representations are extracted from each image modality, and a Shapley-value-based mechanism reduces their dimensionality. For each dimension of the bag-level representation, an attribution value is calculated that indicates how changes in that input dimension affect the model output. We then select the few most important dimensions of each modality's bag-level representation for late fusion. Our experimental results show that the SHAP-CAT framework, when incorporating synthetic modalities, significantly enhances model performance, yielding a 5% increase in accuracy on the BCI dataset, 8% on IHC4BC-ER, and 11% on IHC4BC-PR.
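The selection-then-fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random data, the linear surrogate model, the modality names, and the helpers `linear_shap`, `shap_pool`, and `kronecker_fuse` are all assumptions made for illustration. It relies on the fact that, for a linear model with independent features, the Shapley value of dimension i is w_i (x_i - mean(x_i)); the paper's actual attribution model may differ. The Kronecker-product fusion follows the section title in the paper's outline, with details assumed.

```python
import numpy as np

def linear_shap(weights, X):
    """Exact Shapley values for a linear model f(x) = w.x + b with
    independent features: phi_i = w_i * (x_i - mean(x_i))."""
    return weights * (X - X.mean(axis=0))

def shap_pool(X, phi, k):
    """Keep the k dimensions with the largest mean absolute attribution."""
    importance = np.abs(phi).mean(axis=0)
    top = np.sort(np.argsort(importance)[::-1][:k])
    return X[:, top], top

def kronecker_fuse(reps):
    """Fuse the reduced per-modality vectors of one sample via a Kronecker product."""
    fused = np.ones(1)
    for r in reps:
        fused = np.kron(fused, r)
    return fused

rng = np.random.default_rng(0)
n, d, k = 32, 16, 4  # samples, bag-level dimensions, dimensions kept (all hypothetical)

# Hypothetical bag-level representations for three modalities.
modalities = {name: rng.normal(size=(n, d)) for name in ("he", "ihc", "virtual_ihc")}
# Hypothetical weights of a linear surrogate classifier per modality.
weights = {name: rng.normal(size=d) for name in modalities}

reduced = {}
for name, X in modalities.items():
    phi = linear_shap(weights[name], X)      # attribution per sample and dimension
    reduced[name], _ = shap_pool(X, phi, k)  # (n, k) reduced representation

# Late fusion: one k**3-dimensional vector per sample.
fused = np.stack([kronecker_fuse([reduced[m][i] for m in reduced]) for i in range(n)])
print(fused.shape)  # (32, 64)
```

Reducing each modality to k dimensions before fusion matters here because the Kronecker product grows multiplicatively: three 16-dimensional vectors would give 4096 fused dimensions, whereas three 4-dimensional vectors give 64.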

Paper Structure (25 sections, 9 equations, 1 figure, 5 tables, 2 algorithms)

Introduction
Related work
Framework Design
Modality Generation.
Parallel Feature Extraction Pipeline.
SHAP-CAT Fusion Module.
Explainable Multi-Modal Fusion
Problem Formulation
Interpretability in Machine learning
Shapley Value of feature dimension
Fusion of modality
Generate low-dimension features by SHAP Pooling.
Kronecker product.
Experiments
Datasets and Implementation Details
...and 10 more sections

Figures (1)

Figure 1: The proposed SHAP-CAT framework, which comprises three Parallel Feature Extraction Pipelines, one per modality, and a SHAP-CAT pipeline for prediction from the multimodal representation. (a) A new modality is generated by a pre-trained CycleGAN. (b) Bag-level representations are extracted for each modality by the Parallel Feature Extraction Pipeline, and SHAP pooling reduces their dimensions for late fusion. (c) Illustration of the key idea: how the most important dimensions are selected for reduction. The x-axis shows the attribution value, the y-axis ranks features by the magnitude of their absolute attributions, and the color indicates the feature value. Note that the feature values themselves are black-box and hard to interpret; attribution values make the impact of each feature understandable, and both positive and negative attributions contribute to the output. The left side of (c) shows the SHAP values of each dimension across all samples within a single class; the right side shows the mean absolute SHAP value of each dimension, broken down by class for multi-class tasks.
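The ranking shown in panel (c) can be sketched as follows, assuming the attributions are already available as an array of shape (samples, dimensions, classes). The function name `rank_dimensions` and the toy values are hypothetical. The absolute value is taken before averaging, so positive and negative attributions both count toward a dimension's importance.

```python
import numpy as np

def rank_dimensions(shap_values):
    """shap_values has shape (n_samples, n_dims, n_classes).

    Returns the per-class mean absolute attribution, shape (n_dims, n_classes),
    and the dimension indices ordered from most to least important overall.
    """
    per_class = np.abs(shap_values).mean(axis=0)
    order = np.argsort(-per_class.sum(axis=1), kind="stable")
    return per_class, order

# Toy attributions: 2 samples, 3 dimensions, 2 classes (made-up values).
shap_values = np.zeros((2, 3, 2))
shap_values[:, 0, 0] = [0.1, -0.1]
shap_values[:, 1, 0] = [0.5, -0.7]
shap_values[:, 1, 1] = [-0.2, 0.2]
shap_values[:, 2, 1] = [0.3, 0.3]

per_class, order = rank_dimensions(shap_values)
print(order.tolist())  # [1, 2, 0]
```

Dimension 1 ranks first even though its signed attributions for class 0 nearly cancel, which is why the ranking uses magnitudes rather than signed means.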

Theorems & Definitions (2)

Definition 1
Definition 2
