Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction

Guillaume Jaume, Anurag Vaidya, Richard Chen, Drew Williamson, Paul Liang, Faisal Mahmood

TL;DR

SurvPath tackles the problem of predicting patient survival by fusing spatial histology with global transcriptomics. It introduces a pathway-based tokenizer to semantically encode gene sets and a memory-efficient Transformer to fuse pathway tokens with histology patches in an early-fusion framework. The work provides a multi-level interpretability pipeline using Integrated Gradients and attention heatmaps to reveal genotype–phenotype interactions. On five TCGA cohorts, SurvPath achieves state-of-the-art survival prediction and offers biologically plausible insights for identifying prognostic biomarkers.
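
To make the Integrated Gradients step concrete, here is a minimal sketch on a toy risk function. It is not the authors' pipeline: the `risk` model, its `weights`, the all-zero `baseline`, and the function names are assumptions chosen so the result can be checked by hand. The attribution of feature i is (x_i - baseline_i) times the average gradient along the straight path from the baseline to the input, approximated here with a midpoint Riemann sum.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum."""
    grad_sum = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad_sum = [s + g for s, g in zip(grad_sum, grad_fn(point))]
    # Scale the average gradient by the distance from the baseline.
    return [(xi - b) * s / steps for xi, b, s in zip(x, baseline, grad_sum)]


# Hypothetical toy risk model (an assumption): risk(x) = sum_i w_i * x_i^2
weights = [0.5, -1.0, 2.0]


def risk(x):
    return sum(w * v * v for w, v in zip(weights, x))


def risk_grad(x):
    # Analytic gradient of the toy model above.
    return [2.0 * w * v for w, v in zip(weights, x)]


x = [1.0, 2.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(risk_grad, x, baseline)
print(attributions)
```

In this toy case the attributions sum to `risk(x) - risk(baseline)`, which is the completeness property of Integrated Gradients. A positive attribution pushes the predicted risk up and a negative one pushes it down.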

Abstract

Integrating whole-slide images (WSIs) and bulk transcriptomics for predicting patient survival can improve our understanding of patient prognosis. However, this multimodal task is particularly challenging due to the different nature of these data: WSIs represent a very high-dimensional spatial description of a tumor, while bulk transcriptomics represent a global description of gene expression levels within that tumor. In this context, our work aims to address two key challenges: (1) how can we tokenize transcriptomics in a semantically meaningful and interpretable way? and (2) how can we capture dense multimodal interactions between these two modalities? Specifically, we propose to learn biological pathway tokens from transcriptomics that can encode specific cellular functions. Together with histology patch tokens that encode the different morphological patterns in the WSI, we argue that they form appropriate reasoning units for downstream interpretability analyses. We propose fusing both modalities using a memory-efficient multimodal Transformer that can model interactions between pathway and histology patch tokens. Our proposed model, SurvPath, achieves state-of-the-art performance when evaluated against both unimodal and multimodal baselines on five datasets from The Cancer Genome Atlas. Our interpretability framework identifies key multimodal prognostic factors, and, as such, can provide valuable insights into the interaction between genotype and phenotype, enabling a deeper understanding of the underlying biological mechanisms at play. We make our code public at: https://github.com/ajv012/SurvPath.
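
The two ideas in the abstract, pathway tokens and a memory-efficient fusion of pathway and patch tokens, can be illustrated with a small pure-Python sketch. It is not the released implementation, and several parts are assumptions: the gene and pathway names are placeholders, a fixed random linear map stands in for each learned per-pathway encoder, queries, keys, and values share the raw tokens with no learned projections, and memory is saved by skipping patch-to-patch attention, which is one plausible way to avoid a cost quadratic in the number of patches.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention: every query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        outputs.append([dot(weights, column) for column in zip(*values)])
    return outputs


def make_encoder(n_genes, dim, rng):
    # Stand-in for a learned per-pathway network (assumption): one random linear layer.
    matrix = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(dim)]
    return lambda genes: [dot(row, genes) for row in matrix]


def tokenize_pathways(expression, pathways, dim, seed=0):
    """Map each pathway's member-gene expression values to one dim-sized token."""
    rng = random.Random(seed)
    tokens = {}
    for name, genes in pathways.items():
        encoder = make_encoder(len(genes), dim, rng)
        tokens[name] = encoder([expression[g] for g in genes])
    return tokens


def fuse(pathway_tokens, patch_tokens):
    """Pathways attend to pathways and patches; patches attend to pathways only."""
    everything = pathway_tokens + patch_tokens
    fused_pathways = attend(pathway_tokens, everything, everything)
    # Patch-to-patch attention is skipped, so the cost grows linearly with patches.
    fused_patches = attend(patch_tokens, pathway_tokens, pathway_tokens)
    return fused_pathways, fused_patches


# Placeholder data (hypothetical names and values, for illustration only).
expression = {"gene_a": 1.5, "gene_b": 1.1, "gene_c": -0.7, "gene_d": -0.3, "gene_e": 0.2}
pathways = {"pathway_1": ["gene_a", "gene_b"], "pathway_2": ["gene_c", "gene_d", "gene_e"]}
patch_tokens = [[0.1, 0.4, -0.2], [0.9, -0.5, 0.3], [-0.6, 0.2, 0.8], [0.0, 0.7, -0.1]]

tokens = tokenize_pathways(expression, pathways, dim=3)
pathway_tokens = list(tokens.values())
fused_pathways, fused_patches = fuse(pathway_tokens, patch_tokens)
print(len(fused_pathways), len(fused_patches))
```

With P pathway tokens and H patch tokens, this layout computes P(P + H) + HP attention scores instead of (P + H)^2. The saving matters because a WSI can yield thousands of patches while the number of pathways stays comparatively small.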

Paper Structure

This paper contains 24 sections, 3 equations, 6 figures, and 6 tables.

Figures (6)

  • Figure 1: Multimodal interpretability with $\textsc{SurvPath}$. $\textsc{SurvPath}$ enables visualization of multimodal interactions via a Transformer cross-attention between biological pathways and morphological patterns, here exemplified in a high-risk breast cancer. The chord thickness denotes attention weight.
  • Figure 1: Multi-level interpretability framework. From the multimodal input consisting of a WSI and transcriptomic measurements, and the predicted risk, we can attribute risk at slide-, gene- and biological pathway-level. The framework also enables studying pathway-to-patch interactions and patch-to-pathway interactions for unravelling correspondences between the two modalities.
  • Figure 2: Block diagram of $\textsc{SurvPath}$. (1) We tokenize transcriptomics into biological pathway tokens that are semantically meaningful, interpretable, and end-to-end learnable. (2) We further tokenize the corresponding histology whole-slide image into patch tokens using an SSL pre-trained feature extractor. (3) We combine pathway and patch tokens using a memory-efficient multimodal Transformer for survival outcome prediction.
  • Figure 2: Multi-level interpretability visualization in a bladder cancer patient. Top: Low-risk patient. Bottom: High-risk patient. Genes and pathways in red increase risk, and those in blue decrease risk. Heatmap colors indicate importance, with red indicating high importance and blue indicating low importance. The pathways and morphologies identified as important in these cases generally correspond well with patterns that have been previously described in bladder urothelial carcinoma (e.g., the G2M checkpoint).
  • Figure 3: Multi-level interpretability visualization in a breast cancer patient. Top: Low-risk patient. Bottom: High-risk patient. Genes and pathways in red increase risk, and those in blue decrease risk. Heatmap colors indicate importance, with red indicating high importance and blue indicating low importance. The pathways and morphologies identified as important in these cases generally correspond well with patterns that have been previously described in invasive breast cancer (e.g., Estrogen Response Late).
  • ...and 1 more figure