Pathology-genomic fusion via biologically informed cross-modality graph learning for survival analysis
Zeyu Zhang, Yuanshen Zhao, Jingxian Duan, Yaou Liu, Hairong Zheng, Dong Liang, Zhenyu Zhang, Zhi-Cheng Li
TL;DR
This work tackles survival prediction by fusing histology and transcriptomics through a biology-informed heterogeneous graph (PGHG). It builds pathology and genomic subgraphs guided by prior biological knowledge, supervises pathology features with GSVA pathway scores, and learns cross-modal representations via a graph attention network to produce robust prognostic signals. The approach demonstrates superior performance over unimodal and other multimodal methods across multiple TCGA and FAHZU datasets, with interpretable results highlighting tissue structures and pathways linked to prognosis. The framework offers a scalable, interpretable means to integrate multi-modal clinical data and identify potential biomarkers for cancer survival.
Abstract
The diagnosis and prognosis of cancer are typically based on multi-modal clinical data, including histology images and genomic data, because of the disease's complex pathogenesis and high heterogeneity. Despite advances in digital pathology and high-throughput genome sequencing, establishing effective multi-modal fusion models for survival prediction and revealing the potential association between histopathology and transcriptomics remain challenging. In this paper, we propose the Pathology-Genome Heterogeneous Graph (PGHG), which integrates whole slide images (WSIs) and bulk RNA-Seq expression data using a heterogeneous graph neural network for cancer survival analysis. PGHG consists of a biological knowledge-guided representation learning network and a pathology-genome heterogeneous graph. The representation learning network uses biological prior knowledge of intra-modal and inter-modal data associations to guide feature extraction. The node features of each modality are updated through an attention-based graph learning strategy. Unimodal features and bi-modal fused features are extracted via an attention pooling module and then used for survival prediction. We evaluate the model on low-grade glioma, glioblastoma, and kidney renal papillary cell carcinoma datasets from The Cancer Genome Atlas (TCGA) and the First Affiliated Hospital of Zhengzhou University (FAHZU). Extensive experimental results demonstrate that the proposed method outperforms both unimodal and other multi-modal fusion models. To demonstrate the model's interpretability, we also visualize attention heatmaps of the pathological images and use the integrated gradients algorithm to identify important tissue structures, biological pathways, and key genes.
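
The attention pooling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the pooling formulation, so this example assumes a common one in which each node receives a score w · tanh(V h_i), the scores are normalized with a softmax, and the node features are summed with those weights. The function name `attention_pool`, the parameters `V` and `w`, and the choice to obtain a fused feature by pooling over the nodes of both modalities together are all hypothetical, not taken from the paper.

```python
import math


def attention_pool(nodes, V, w):
    """Pool a list of node feature vectors into a single vector.

    nodes: list of n feature vectors, each of length d
    V:     hidden projection as a list of h rows of length d (hypothetical)
    w:     scoring vector of length h (hypothetical)
    Returns (pooled_vector, attention_weights).
    """
    # Score every node: s_i = w . tanh(V h_i)
    scores = []
    for h_i in nodes:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h_i))) for row in V]
        scores.append(sum(w_j * z_j for w_j, z_j in zip(w, hidden)))

    # Softmax over nodes, shifted by the maximum for numerical stability
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the node features
    dim = len(nodes[0])
    pooled = [sum(a * h_i[k] for a, h_i in zip(weights, nodes)) for k in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    # Toy node features; real nodes would come from the graph learning stage.
    pathology_nodes = [[0.2, 1.0], [0.9, 0.1], [0.4, 0.4]]
    genomic_nodes = [[1.5, -0.3], [0.0, 0.7]]
    V = [[0.5, -0.2], [0.1, 0.8]]
    w = [1.0, -1.0]

    path_feat, path_attn = attention_pool(pathology_nodes, V, w)
    gene_feat, gene_attn = attention_pool(genomic_nodes, V, w)
    # Assumption: a fused feature is obtained by pooling over both node sets.
    fused_feat, _ = attention_pool(pathology_nodes + genomic_nodes, V, w)
    print(path_feat, gene_feat, fused_feat)
```

In this sketch the attention weights are returned alongside the pooled vector because per-node weights of this kind are what an attention heatmap over image patches would display.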
