Prototypical Information Bottlenecking and Disentangling for Multimodal Cancer Survival Prediction

Yilan Zhang, Yingxue Xu, Jianqi Chen, Fengying Xie, Hao Chen

TL;DR

Multimodal cancer survival prediction faces intra-modal redundancy from noisy, high-dimensional data and inter-modal redundancy from overlapping information across modalities. The authors introduce PIBD, a framework combining Prototypical Information Bottleneck (PIB) for intra-modal redundancy reduction and Prototypical Information Disentanglement (PID) for inter-modal disentanglement into modality-common and modality-specific components. PIB uses prototypes to represent risk-specific latent distributions and select discriminative instances within bags, while PID leverages joint prototypes to guide a disentangled transformer that isolates common and specific information across pathology and genomics. Across five TCGA cancer datasets, PIBD achieves state-of-the-art concordance indices, with ablations confirming the distinct contributions of PIB and PID and visualizations supporting interpretability of prototype-driven selections.
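The concordance index mentioned above measures how often a model ranks patients correctly by risk. As a rough illustration (a minimal pure-Python sketch of Harrell's C-index, not the evaluation code used by the authors; the function name and toy data are hypothetical), a pair of patients is comparable when the one with the shorter time had an observed event, and it is concordant when that patient also received the higher predicted risk:

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that are ranked correctly.

    times  -- observed survival or censoring times
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Pair (i, j) is comparable only if i had an observed event
            # strictly before j's recorded time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort (made-up values, for illustration only).
times = [5, 10, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so gains reported in C-index are gains in ranking quality rather than in absolute survival-time error.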

Abstract

Multimodal learning significantly benefits cancer survival prediction, especially through the integration of pathological images and genomic data. Despite these advantages, massive redundancy in multimodal data prevents models from extracting discriminative and compact information: (1) A large amount of task-unrelated intra-modal information blurs discriminability, especially for gigapixel whole slide images (WSIs) with many patches in pathology and thousands of pathways in genomic data, leading to an "intra-modal redundancy" issue. (2) Information duplicated across modalities dominates the multimodal representation, so modality-specific information is prone to being ignored, resulting in an "inter-modal redundancy" issue. To address these issues, we propose a new framework, Prototypical Information Bottlenecking and Disentangling (PIBD), consisting of a Prototypical Information Bottleneck (PIB) module for intra-modal redundancy and a Prototypical Information Disentanglement (PID) module for inter-modal redundancy. Specifically, PIB, a variant of the information bottleneck, models prototypes that approximate a set of instances for each risk level, which can then be used to select discriminative instances within a modality. The PID module decouples entangled multimodal data into compact, distinct components, namely modality-common and modality-specific knowledge, under the guidance of the joint prototypical distribution. Extensive experiments on five cancer benchmark datasets demonstrate the superiority of PIBD over other methods.
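To make the idea of prototype-guided instance selection more concrete, the sketch below scores each instance in a bag by its cosine similarity to the closest prototype and keeps the top k. This is a deliberately simplified stand-in: the actual PIB module is described as an information-bottleneck variant over latent distributions, whereas the similarity measure, the function names (`cosine`, `select_instances`), and the top-k rule here are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def select_instances(bag, prototypes, k):
    """Return indices of the k instances closest to any prototype.

    bag        -- list of instance feature vectors (e.g. patch embeddings)
    prototypes -- list of prototype vectors (hypothetically one per risk level)
    k          -- number of instances to keep
    """
    scored = []
    for idx, inst in enumerate(bag):
        # Score an instance by its best match among all prototypes.
        best = max(cosine(inst, p) for p in prototypes)
        scored.append((best, idx))
    # Highest similarity first; break ties by original position.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return sorted(idx for _, idx in scored[:k])


# Toy bag of 2-D features (made-up values, for illustration only).
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
print(select_instances(bag, [[1.0, 0.0]], k=2))
```

Instances that resemble no prototype are dropped, which is one simple way to reduce the number of task-unrelated patches or pathways passed to later stages.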

Paper Structure (27 sections, 34 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Framework of PIBD. Patient data from pathology and genomics are initially structured into bags. The Prototypical Information Bottleneck (PIB) selects discriminative features to reduce "intra-modal redundancy". Subsequently, the Prototypical Information Disentanglement (PID) module decouples the specific and common information to tackle "inter-modal redundancy".
  • Figure 2: Disentangled Transformer. Self-attention models the intra-modal interactions, while a token sampled from the joint prototypical distribution guides common information extraction through cross-attention.
  • Figure 3: Kaplan-Meier curves of predicted high-risk (red) and low-risk (green) groups. A P-value $< 0.05$ indicates statistical significance, and the shaded regions represent the confidence intervals. The median survival months are reported in the format of "high-risk: mean(std)/low-risk: mean(std)".
  • Figure 3: Interventions in PIB. We conduct interventions by either removing the positive prototype or randomly deleting one of the negative prototypes.
  • Figure 4: Visualization of prototypes.
  • ...and 3 more figures