MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images
Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang, Jie Zhang, Alisa Yurovsky, Travis Steele Johnson, Chao Chen
TL;DR
MERGE predicts spatial gene expression from whole-slide histopathology images (WSIs) by building a multi-faceted hierarchical graph over tissue patches that captures both local morphology and long-range interactions. Patches are clustered in both spatial and feature space to define intra-cluster edges, and centroid-based shortcut edges connect the clusters so that a Graph Attention Network can pass information between distant regions efficiently. The pipeline uses a ResNet18-based patch encoder and a gene-aware smoothing technique (SPCS). Across datasets, MERGE outperforms state-of-the-art baselines on Pearson correlation coefficient (PCC), mean squared error (MSE), and mean absolute error (MAE), and qualitative heatmaps and ablation analyses support the design choices. The result is a more robust, morphology-guided way to infer spatial transcriptomics profiles from WSIs, with practical implications for clinical and research settings.
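For readers unfamiliar with the three metrics, the following is a minimal sketch of how PCC, MSE, and MAE can be computed for a single gene across spots. The function names and the toy values are illustrative assumptions, not taken from the paper, and the paper's exact aggregation (for example, averaging per-gene scores over genes or slides) is not specified here.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted expression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted expression."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def pcc(y_true, y_pred):
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

# Hypothetical expression values for one gene at four spots.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]

scores = {
    "PCC": pcc(measured, predicted),
    "MSE": mse(measured, predicted),
    "MAE": mae(measured, predicted),
}
```

In this toy case the prediction is perfectly correlated with the measurement but off in scale, which illustrates why a correlation metric and the error metrics are reported together: a high PCC alone does not imply small errors.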
Abstract
Recent advances in Spatial Transcriptomics (ST) pair histology images with spatially resolved gene expression profiles, enabling the prediction of gene expression at different tissue locations from image patches. This opens up new possibilities for enhancing whole slide image (WSI) prediction tasks with localized gene expression. However, existing methods fail to fully leverage the interactions between different tissue locations, which are crucial for accurate joint prediction. To address this, we introduce MERGE (Multi-faceted hiErarchical gRaph for Gene Expressions), which combines a multi-faceted hierarchical graph construction strategy with graph neural networks (GNNs) to improve gene expression prediction from WSIs. By clustering tissue image patches based on both spatial and morphological features, and by incorporating intra- and inter-cluster edges, our approach fosters interactions between distant tissue locations during GNN learning. As an additional contribution, we evaluate data smoothing techniques, which are needed to mitigate the artifacts in ST data that often arise from technical imperfections. We advocate adopting gene-aware smoothing methods, which are more biologically justified. Experimental results on gene expression prediction show that our GNN method outperforms state-of-the-art techniques across multiple metrics.
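The graph construction described above can be illustrated with a small sketch. It is not the authors' implementation: the tiny k-means routine, the choice to fully connect patches within a cluster, the choice of the member nearest the cluster center as the "centroid" node, and all names (`kmeans`, `build_hierarchical_edges`, `k_spatial`, `k_feature`) are assumptions made for illustration. It clusters patches once by spatial coordinates and once by feature vectors, adds intra-cluster edges, and links the centroid nodes of each clustering as shortcut (inter-cluster) edges.

```python
import math
from itertools import combinations

def kmeans(points, k, iters=20):
    """Tiny deterministic Lloyd's k-means; the first k points seed the centers."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def build_hierarchical_edges(coords, feats, k_spatial, k_feature):
    """Return undirected edges (i, j) with i < j over patch indices."""
    edges = set()
    for points, k in ((coords, k_spatial), (feats, k_feature)):
        labels, centers = kmeans(points, k)
        centroid_nodes = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                continue
            # Intra-cluster edges (assumption: fully connect each cluster).
            edges.update(combinations(members, 2))
            # Centroid node (assumption: the member nearest the cluster center).
            centroid_nodes.append(
                min(members, key=lambda i: math.dist(points[i], centers[c])))
        # Shortcut edges between the centroid nodes of this clustering.
        edges.update(tuple(sorted(e)) for e in combinations(centroid_nodes, 2))
    return edges

# Hypothetical toy slide: six patches in two spatial groups, with 1-D
# "morphology" features that group the patches differently.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
feats = [(0.0,), (1.0,), (0.1,), (0.9,), (0.0,), (1.0,)]
edges = build_hierarchical_edges(coords, feats, k_spatial=2, k_feature=2)
```

In this toy example, patches 0 and 4 are far apart on the slide but share similar features, so the feature-space clustering links them directly, while the spatial centroid nodes 0 and 3 are joined by a shortcut edge. The resulting edge set could then be passed to a GNN layer such as graph attention.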
