Interpretable Multimodal Cancer Prototyping with Whole Slide Images and Incompletely Paired Genomics
Yupei Zhang, Yating Huang, Wanming Hu, Lequan Yu, Hujun Yin, Chao Li
TL;DR
This work tackles the challenge of integrating WSIs and incompletely paired genomics for cancer precision medicine by introducing a biologically grounded prototyping framework. It learns intra-modal representations through biologically informed prototypes, aligns modalities via distribution- and sample-wise strategies, performs selective cross-modal fusion with a bipartite scheme, and imputes missing genomics with a Semantic Genomics Imputation (SGI) module. The approach yields state-of-the-art results on glioma diagnosis, grading, and survival prediction while offering interpretable insights through prototype importance and cross-modal interactions. The findings suggest strong potential for clinically robust deployment in settings with incomplete molecular data, with avenues for extending to other modalities and multi-institutional datasets.
Abstract
Multimodal approaches that integrate histology and genomics hold strong potential for precision oncology. However, phenotypic and genotypic heterogeneity limits the quality of intra-modal representations and hinders effective inter-modal integration. Furthermore, most existing methods overlook real-world clinical scenarios where genomics may be partially missing or entirely unavailable. We propose a flexible multimodal prototyping framework to integrate whole slide images and incomplete genomics for precision oncology. Our approach has four key components: 1) Biological Prototyping using text prompting and prototype-wise weighting; 2) Multiview Alignment through sample- and distribution-wise alignments; 3) Bipartite Fusion to capture both shared and modality-specific information for multimodal fusion; and 4) Semantic Genomics Imputation to handle missing data. Extensive experiments demonstrate the consistent superiority of the proposed method compared to other state-of-the-art approaches on multiple downstream tasks. The code is available at https://github.com/helenypzhang/Interpretable-Multimodal-Prototyping.
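To make the idea of sample-wise alignment under incomplete pairing more concrete, the sketch below computes a symmetric InfoNCE-style contrastive loss over only those cases that have both a WSI embedding and a genomics embedding. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the alignment loss, and the function names, the temperature value, and the convention of marking missing genomics with `None` are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(anchors, candidates, temperature=0.1):
    """Mean cross-entropy of matching anchor i to candidate i."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, c) / temperature for c in candidates]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

def sample_wise_alignment_loss(wsi_embs, gen_embs, temperature=0.1):
    """Symmetric InfoNCE over cases that have both modalities.

    Assumption: gen_embs[i] is None when genomics is missing for case i;
    such cases are skipped here rather than imputed.
    """
    paired = [(w, g) for w, g in zip(wsi_embs, gen_embs) if g is not None]
    if len(paired) < 2:
        return 0.0  # no negatives available, so no contrastive signal
    wsi = [w for w, _ in paired]
    gen = [g for _, g in paired]
    return 0.5 * (info_nce(wsi, gen, temperature) + info_nce(gen, wsi, temperature))

# Toy data: three cases, the third has no genomics profile.
wsi = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
aligned = [[0.9, 0.1], [0.1, 0.9], None]   # genomics close to the matching WSI
swapped = [[0.1, 0.9], [0.9, 0.1], None]   # genomics close to the wrong WSI
loss_aligned = sample_wise_alignment_loss(wsi, aligned)
loss_swapped = sample_wise_alignment_loss(wsi, swapped)
```

On this toy data the correctly paired embeddings give a much lower loss than the swapped ones, and the case without genomics contributes nothing to either value. In the proposed framework such cases would instead be handled by the imputation component, which this sketch does not attempt to reproduce.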
