Generating crossmodal gene expression from cancer histopathology improves multimodal AI predictions

Samiran Dey, Christopher R. S. Banerji, Partha Basuchowdhuri, Sanjoy K. Saha, Deepak Parashar, Tapabrata Chakraborti

TL;DR

A crossmodal generative model, PathGen, is developed to synthesise transcriptomic data from histopathology slides, and combining the synthesised data with the slides is shown to improve cancer diagnosis and prognosis prediction.

Abstract

Emerging research has highlighted that artificial intelligence-based multimodal fusion of digital pathology and transcriptomic features can improve cancer diagnosis (grading/subtyping) and prognosis (survival risk) prediction. However, such direct fusion is impractical in clinical settings, where histopathology remains the gold standard and transcriptomic tests are rarely requested in public healthcare. We experiment on two publicly available multimodal datasets, The Cancer Genome Atlas and the Clinical Proteomic Tumor Analysis Consortium, spanning four independent cohorts: glioma-glioblastoma, renal, uterine, and breast. We observe significant performance gains in gradation and risk estimation (p-value<0.05) when synthesized transcriptomic data are combined with whole slide images (WSIs). Moreover, predictions using synthesized features were statistically close to those obtained with real transcriptomic data (p-value>0.05), consistently across cohorts. Here we show that with our diffusion-based crossmodal generative AI model, PathGen, gene expressions synthesized from digital histopathology jointly predict cancer grading and patient survival risk with high accuracy (state-of-the-art performance), certainty (through conformal coverage guarantee) and interpretability (through distributed co-attention maps). PathGen code is available for open use on GitHub at https://github.com/Samiran-Dey/PathGen.
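To make the "conformal coverage guarantee" mentioned in the abstract concrete, the following is a minimal sketch of split conformal prediction for a grade classifier. It is a generic textbook construction and not necessarily the procedure used in PathGen. The function names `conformal_threshold` and `prediction_sets`, the nonconformity score (one minus the probability of the true grade), and the input arrays are all assumptions made for illustration.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a held-out calibration set (hypothetical helper).

    cal_probs:  (n, K) predicted grade probabilities on calibration patients.
    cal_labels: (n,) integer ground-truth grades in [0, K).
    alpha:      target miscoverage, e.g. 0.1 for 90% coverage.
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - probability assigned to the true grade.
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    # Finite-sample corrected rank (1-indexed) of the threshold score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # Too few calibration points for this alpha: every grade must be included.
    return np.inf if k > n else scores[k - 1]

def prediction_sets(test_probs, qhat):
    """Boolean (m, K) mask; True where grade k belongs to the prediction set."""
    return (1.0 - test_probs) <= qhat
```

Under the usual exchangeability assumption between calibration and test patients, sets built this way contain the true grade with probability at least 1 - alpha on average. Larger sets indicate that the model is less certain about a given patient.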

Paper Structure

This paper contains 30 sections, 16 equations, 12 figures, 14 tables.

Figures (12)

  • Figure 1: Methodology overview. Pipeline for the proposed methodology. Our diffusion-based crossmodal generative model, PathGen, synthesizes transcriptomic features from whole slide image (WSI) patch embeddings obtained using a state-of-the-art foundation model. For each patient, multimodal prediction is performed for both diagnosis (cancer grading) and prognosis (survival risk) using the synthesized transcriptomic data and histopathology images to predict the added value of transcriptomic data. Uncertainty quantification provides a patient-level estimate of model reliability and gives clinicians survival risk bounds and tumour grade prediction sets which are guaranteed to contain the ground truth with a specified probability. Distributed predictions also provide a window to understand the intra-tumour heterogeneity.
  • Figure 2: Evaluation of synthesized transcriptomic data. (a) Comparison between real and synthesized gene expression levels over all genes and for different gene groups. (b) Plot showing the significant performance improvement from using synthesized transcriptomic data over using WSIs alone (marked as without), and the closeness of results obtained with synthesized and real transcriptomic data, for survival risk estimation and gradation. (c) Plot comparing uncertainty and obtained coverage for survival risk and grade predictions using real and synthesized transcriptomic data. Source data are provided as a Source Data file.
  • Figure 3: Explainability maps for a male patient of age 58, ground truth grade III, survival time 10.71 months, and survival status alive from the TCGA-LGG cohort. (a) whole slide image (WSI) (b) intra-tumour gradation heterogeneity (c) intra-tumour survival heterogeneity (d) co-attention maps for real and synthesized transcriptomic data and WSI patches for different gene groups (e) comparative study of intra-tumour heterogeneity and corresponding co-attention maps obtained for prediction using synthesized transcriptomic data for chosen regions marked with different coloured rectangles.
  • Figure 4: Explainability maps for a male patient of age 40, ground truth grade IV, survival time 33.99 months, and survival status deceased from the TCGA-KIRC cohort. (a) whole slide image (WSI) (b) intra-tumour gradation heterogeneity (c) intra-tumour survival heterogeneity (d) co-attention maps for real and synthesized transcriptomic data and WSI patches for different gene groups (e) comparative study of intra-tumour heterogeneity and corresponding co-attention maps obtained for prediction using synthesized transcriptomic data for chosen regions marked with different coloured rectangles.
  • Figure 5: Explainability analysis. (a) (i) Plot comparing distributed and non-distributed WSI predictions for gradation using both real and synthesized transcriptomic data. (ii) Plot showing that the normalized average co-attention value for the true grade is higher than that of false grades. (b) Comparison between co-attention maps obtained using real and synthesized transcriptomic data over all genes and for different gene groups. (c) Plot illustrating the percentage contribution of the gene groups in co-attention for different TCGA and CPTAC data cohorts. Source data are provided as a Source Data file.
  • ...and 7 more figures