Interpretable Multimodal Cancer Prototyping with Whole Slide Images and Incompletely Paired Genomics

Yupei Zhang, Yating Huang, Wanming Hu, Lequan Yu, Hujun Yin, Chao Li

TL;DR

This work tackles the challenge of integrating whole slide images (WSIs) with incompletely paired genomics for precision oncology by introducing a biologically grounded prototyping framework. It learns intra-modal representations through biologically informed prototypes, aligns the modalities with distribution- and sample-wise strategies, performs selective cross-modal fusion with a bipartite scheme, and imputes missing genomics with a Semantic Genomics Imputation (SGI) module. The approach yields state-of-the-art results on glioma diagnosis, grading, and survival prediction while offering interpretable insights through prototype importance and cross-modal interactions. The findings suggest strong potential for clinically robust deployment in settings with incomplete molecular data, with avenues for extending to other modalities and multi-institutional datasets.

Abstract

Multimodal approaches that integrate histology and genomics hold strong potential for precision oncology. However, phenotypic and genotypic heterogeneity limits the quality of intra-modal representations and hinders effective inter-modal integration. Furthermore, most existing methods overlook real-world clinical scenarios where genomics may be partially missing or entirely unavailable. We propose a flexible multimodal prototyping framework to integrate whole slide images and incomplete genomics for precision oncology. Our approach has four key components: 1) Biological Prototyping using text prompting and prototype-wise weighting; 2) Multiview Alignment through sample- and distribution-wise alignments; 3) Bipartite Fusion to capture both shared and modality-specific information for multimodal fusion; and 4) Semantic Genomics Imputation to handle missing data. Extensive experiments demonstrate the consistent superiority of the proposed method compared to other state-of-the-art approaches on multiple downstream tasks. The code is available at https://github.com/helenypzhang/Interpretable-Multimodal-Prototyping.
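
As a rough illustration of the sample-wise part of Multiview Alignment, the sketch below computes a symmetric InfoNCE-style contrastive loss between paired WSI and genomic embeddings. The choice of InfoNCE is an assumption made for illustration, since the abstract does not specify the loss, and the names `cosine`, `info_nce`, `symmetric_alignment_loss`, and the `temperature` value are hypothetical rather than taken from the released code.

```python
import math
from typing import List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors: List[Vector], targets: List[Vector],
             temperature: float = 0.1) -> float:
    """Mean cross-entropy of matching anchors[i] to targets[i].

    Illustrative assumption: row i of both lists comes from the same patient,
    and every other row in the batch acts as a negative.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], targets[j]) / temperature for j in range(n)]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n


def symmetric_alignment_loss(wsi: List[Vector], genomics: List[Vector],
                             temperature: float = 0.1) -> float:
    """Average the WSI-to-genomics and genomics-to-WSI directions."""
    forward = info_nce(wsi, genomics, temperature)
    backward = info_nce(genomics, wsi, temperature)
    return 0.5 * (forward + backward)


# Toy embeddings: two patients, two dimensions.
wsi_embeddings = [[1.0, 0.0], [0.0, 1.0]]
paired_genomics = [[1.0, 0.0], [0.0, 1.0]]     # correctly paired
mismatched_genomics = [[0.0, 1.0], [1.0, 0.0]]  # pairs swapped

aligned_loss = symmetric_alignment_loss(wsi_embeddings, paired_genomics)
misaligned_loss = symmetric_alignment_loss(wsi_embeddings, mismatched_genomics)
```

In this toy case the loss is close to zero when each slide embedding matches its own patient's genomic embedding and much larger when the pairs are swapped, which is the behaviour a sample-wise alignment term is meant to reward.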

Paper Structure

This paper contains 36 sections, 13 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: An overview of the proposed framework. (a) Biological Prototyping (BP): Extracts biologically meaningful prototypes from WSI and genomic data through class-specific prompting and a dynamic importance weighting module, enhancing feature interpretability. (b) Multiview Alignment (MA): Bridges the semantic gap between WSI and genomic features using sample-wise alignment and mutual information maximization. (c) Bipartite Fusion (BF): Integrates shared and modality-specific features to optimize multimodal fusion. (d) Semantic Genomics Imputation (SGI): Handles missing genomic data scenarios by imputing features directly in the feature space.
  • Figure 2: Multiview Alignment (MA): Bridges the semantic gap between WSI and genomic features using mutual information maximization and sample-wise alignment. Left: Maximize mutual information in MA. Right: Sample-wise consistency.
  • Figure 3: Semantic Genomics Imputation (SGI): A CycleGAN-based network to generate genomic features from histology features, with a progressive interpolation to ensure robustness in various settings.
  • Figure 4: Bipartite Fusion: Utilizing prototype-wise affinity to integrate shared and modality-specific features for multimodal fusion.
  • Figure 5: Interpretability heatmaps for glioma analysis across histology and genomics. (a) Glioma Grading: patients are grouped into Grades 2-4. (b) Glioma Diagnosis (WHO 2021): GBM Grade 4, Astrocytoma Grade 4, Astrocytoma Grade 3, Astrocytoma Grade 2, Oligodendroglioma Grade 3, Oligodendroglioma Grade 2. In each task, the upper heatmaps display the importance of histology prototypes (e.g., Neoplastic, Necrotic), while the lower heatmaps display the importance of genomic prototypes (e.g., Protein Kinase, Oncogenes). The importance score is normalized per patient. Our model demonstrates strong interpretability by identifying biologically meaningful markers for glioma grading and diagnosis.
  • ...and 3 more figures
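
The Figure 5 caption notes that importance scores are normalized per patient. A minimal sketch of one way to do this is shown below, assuming min-max scaling within each patient's row of prototype scores. The caption does not say which normalization is used, so the scaling choice, the function name `normalize_per_patient`, and the toy values are all illustrative assumptions.

```python
from typing import Dict, List


def normalize_per_patient(scores: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Min-max scale each patient's prototype importance scores to [0, 1].

    Assumption: `scores` maps a patient ID to one raw importance value per
    prototype. Each patient is scaled independently, so heatmap rows show
    which prototypes matter most within a patient.
    """
    normalized: Dict[str, List[float]] = {}
    for patient_id, row in scores.items():
        low, high = min(row), max(row)
        span = high - low
        # A constant row carries no ranking information; map it to zeros.
        normalized[patient_id] = [
            0.0 if span == 0 else (value - low) / span for value in row
        ]
    return normalized


# Toy example with three prototypes per patient (values are made up).
raw_scores = {
    "patient_a": [1.0, 2.0, 3.0],
    "patient_b": [5.0, 5.0, 5.0],
}
normalized_scores = normalize_per_patient(raw_scores)
```

Scaling each patient independently makes rows easy to read on their own, at the cost of hiding differences in absolute importance between patients.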