PEaRL: Pathway-Enhanced Representation Learning for Gene and Pathway Expression Prediction from Histology
Sejuti Majumder, Saarthak Kapse, Moinak Bhattacharya, Xuan Xu, Alisa Yurovsky, Prateek Prasanna
TL;DR
PEaRL links tissue morphology with molecular function by representing spatial transcriptomics (ST) as pathway activation scores computed with ssGSEA, then aligning these pathway representations to histology with a transformer-based encoder and contrastive learning. Training proceeds in two stages: the first learns a shared latent space for pathways and images, and the second trains lightweight heads to predict both pathway and gene expression. The model achieves higher Pearson correlation coefficients (PCCs) than prior methods across three cancer ST datasets and improves survival prognostication. Ablation studies highlight the importance of grounding transcriptomic signals in pathways and the advantage of using the UNI foundation model for histology features. Overall, PEaRL provides a more biologically faithful and interpretable multimodal framework that advances computational pathology beyond gene-level embeddings and paves the way for pan-cancer analyses and clinical biomarker discovery.
Abstract
Integrating histopathology with spatial transcriptomics (ST) provides a powerful opportunity to link tissue morphology with molecular function. Yet most existing multimodal approaches rely on a small set of highly variable genes, which limits predictive scope and overlooks the coordinated biological programs that shape tissue phenotypes. We present PEaRL (Pathway-Enhanced Representation Learning), a multimodal framework that represents transcriptomics through pathway activation scores computed with single-sample gene set enrichment analysis (ssGSEA). By encoding biologically coherent pathway signals with a transformer and aligning them with histology features via contrastive learning, PEaRL reduces dimensionality, improves interpretability, and strengthens cross-modal correspondence. Across three cancer ST datasets (breast, skin, and lymph node), PEaRL consistently outperforms state-of-the-art (SOTA) methods on both gene- and pathway-level expression prediction, with increases in Pearson correlation coefficient (PCC) of up to 58.9% and 20.4% over SOTA. These results demonstrate that grounding transcriptomic representation in pathways produces more biologically faithful and interpretable multimodal models, advancing computational pathology beyond gene-level embeddings.
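The abstract does not specify the exact contrastive objective, so the following is only a minimal sketch of how paired image and pathway embeddings could be aligned and how per-feature PCC could be computed. It assumes a CLIP-style symmetric InfoNCE loss; the function names, the temperature of 0.07, and the array shapes are all hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(image_emb, pathway_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired spots (assumed objective).

    Row i of image_emb and row i of pathway_emb are treated as a positive
    pair from the same spot; every other row in the batch is a negative.
    """
    img = l2_normalize(image_emb)
    pth = l2_normalize(pathway_emb)
    logits = img @ pth.T / temperature  # (batch, batch) scaled cosine similarities
    idx = np.arange(logits.shape[0])
    loss_i2p = -log_softmax(logits, axis=1)[idx, idx].mean()  # image -> pathway
    loss_p2i = -log_softmax(logits, axis=0)[idx, idx].mean()  # pathway -> image
    return 0.5 * (loss_i2p + loss_p2i)


def pearson_per_feature(pred, target):
    """Pearson correlation for each column (gene or pathway) across spots.

    Columns with zero variance yield a division by zero and are not handled here.
    """
    p = pred - pred.mean(axis=0)
    t = target - target.mean(axis=0)
    denom = np.sqrt((p ** 2).sum(axis=0) * (t ** 2).sum(axis=0))
    return (p * t).sum(axis=0) / denom
```

In this sketch, embeddings of shape `(n_spots, dim)` from the two encoders would be passed to `symmetric_contrastive_loss` in the first stage, and `pearson_per_feature` would score the second-stage prediction heads against measured gene or pathway values, one PCC per column.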
