Genomics-guided Representation Learning for Pathologic Pan-cancer Tumor Microenvironment Subtype Prediction
Fangliangzi Meng, Hongrun Zhang, Ruodan Yan, Guohui Chuai, Chao Li, Qi Liu
TL;DR
Pan-cancer TME subtype prediction from WSIs is challenged by heterogeneity and subtle microenvironment cues. PathoTME introduces a genomics-guided Siamese representation with visual prompts and domain-adversarial training to align WSI embeddings with gene information while mitigating tissue-origin bias, enabling robust cross-cancer TME subtype classification without needing genomic data at inference. The approach demonstrates superior performance across 23 TCGA datasets, with ablations confirming the complementary benefits of Siamese guidance, DANN, and visual prompts, and shows improved stability compared to single-dataset baselines. This work advances practical, genomics-informed histopathology analysis for precision oncology, offering a scalable framework for pan-cancer TME profiling and potential clinical deployment with reduced dependency on genomic assays.
Abstract
Characterizing the Tumor MicroEnvironment (TME) is challenging due to its complexity and heterogeneity. Relatively consistent TME characteristics are embedded within highly tissue-specific features, which makes them difficult to predict. Accurate classification of TME subtypes is critical for clinical tumor diagnosis and precision medicine. Based on the observation that tumors of different origins share similar microenvironment patterns, we propose PathoTME, a genomics-guided Siamese representation learning framework that uses Whole Slide Images (WSIs) for pan-cancer TME subtype prediction. Specifically, we use a Siamese network in which genomic information acts as a regularizer that assists the learning of WSI embeddings during the training phase. Additionally, we employ a Domain Adversarial Neural Network (DANN) to mitigate the impact of tissue type variations. To eliminate domain bias, we design a dynamic WSI prompt that further unlocks the model's capabilities. Our model outperforms other state-of-the-art methods across 23 cancer types in the TCGA dataset. Our code is available at https://github.com/Mengflz/PathoTME.
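
The training objective described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that the genomic regularizer is a cosine alignment term between the WSI and gene embeddings, that the tissue-type discriminator is trained with cross-entropy, and that the loss weights `align_weight` and `domain_weight` exist; all function names here are hypothetical. The dynamic WSI prompt is omitted.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def reverse_gradient(upstream_grad, lam=1.0):
    """Backward pass of a gradient reversal layer as used in DANN.

    The forward pass is the identity; on the way back the gradient from the
    tissue-type discriminator is negated and scaled by lam, so the encoder
    is pushed toward features that do not reveal tissue origin.
    """
    return [-lam * g for g in upstream_grad]

def training_loss(wsi_emb, gene_emb, subtype_logits, subtype_label,
                  tissue_logits, tissue_label,
                  align_weight=1.0, domain_weight=1.0):
    """Combine the three training terms (weights are assumed, not sourced)."""
    # Supervised TME subtype classification on the WSI branch.
    cls = cross_entropy(subtype_logits, subtype_label)
    # Assumed genomic regularizer: pull the WSI embedding toward the gene embedding.
    align = 1.0 - cosine_similarity(wsi_emb, gene_emb)
    # Tissue-type discriminator loss; its gradient reaches the encoder reversed.
    domain = cross_entropy(tissue_logits, tissue_label)
    return cls + align_weight * align + domain_weight * domain

def predict_subtype(subtype_logits):
    """Inference uses only the WSI branch; no genomic input is required."""
    return max(range(len(subtype_logits)), key=lambda i: subtype_logits[i])
```

In this sketch the gene embedding appears only in `training_loss`, which mirrors the point that genomic data guides training but is not needed when `predict_subtype` is called on a new slide.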
