A Novel Approach to Linking Histology Images with DNA Methylation
Manahil Raza, Muhammad Dawood, Talha Qaiser, Nasir M. Rajpoot
TL;DR
This work introduces SlideGraph^methyl, a weakly supervised graph neural network framework that predicts the differential DNA methylation states of gene groups from whole-slide histology images. By constructing WSI graphs from patch features and training with a pairwise ranking objective, the method achieves higher AUROC and AP than state-of-the-art baselines across TCGA glioma and renal carcinoma cohorts, and it reveals biologically meaningful enrichment via GSEA and spatially resolved heatmaps. The approach demonstrates that spatial histopathology patterns can serve as digital biomarkers for epigenetic states, potentially enabling faster, image-based cancer stratification alongside traditional methylation assays. The study also provides insights into tumor biology by linking visual patterns to methylation-driven pathways, and the authors plan to extend it to multi-modal data and additional cancer types.
Abstract
DNA methylation is an epigenetic mechanism that regulates gene expression by adding methyl groups to DNA. Abnormal methylation patterns can disrupt gene expression and have been linked to cancer development. DNA methylation is typically quantified with specialized assays, but these assays are often costly and slow, which limits their availability in routine clinical practice. In contrast, whole slide images (WSIs) are readily available for the majority of cancer patients, so there is a compelling need to explore the potential relationship between WSIs and DNA methylation patterns. To address this, we propose an end-to-end, graph neural network based, weakly supervised learning framework to predict the methylation state of gene groups exhibiting coherent patterns across samples. Using data from three cohorts from The Cancer Genome Atlas (TCGA), namely TCGA-LGG (Brain Lower Grade Glioma), TCGA-GBM (Glioblastoma Multiforme) ($n$=729) and TCGA-KIRC (Kidney Renal Clear Cell Carcinoma) ($n$=511), we demonstrate that the proposed approach achieves significantly higher AUROC scores than state-of-the-art (SOTA) methods, with an improvement of more than $20\%$. We conduct gene set enrichment analyses on the gene groups and show that the majority of the gene groups are significantly enriched in important hallmarks and pathways. We also generate spatially enriched heatmaps to further investigate links between histological patterns and DNA methylation states. To the best of our knowledge, this is the first study that explores the association of spatially resolved histological patterns with gene group methylation states across multiple cancer types using weakly supervised deep learning.
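The two ingredients named in the TL;DR, a graph built over WSI patches and a pairwise ranking objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the k-nearest-neighbour rule over patch coordinates, the hinge form of the loss, the margin value, and the function names `build_knn_edges` and `pairwise_ranking_loss` are all assumptions made for illustration, since this section does not specify them.

```python
import numpy as np


def build_knn_edges(coords, k=2):
    """Connect each patch to its k nearest patches (assumed graph rule).

    coords: (n_patches, 2) array of patch centre positions.
    Returns an integer array of shape (2, n_patches * k) holding
    (source, target) node indices.
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a patch is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]
    src = np.repeat(np.arange(len(coords)), k)
    return np.stack([src, nbrs.ravel()], axis=0)


def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Mean hinge loss over all (positive, negative) slide pairs.

    The loss is zero once every positive slide (label 1) scores at
    least `margin` above every negative slide (label 0). The hinge
    form and the margin are assumptions, not taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        return 0.0  # no pairs to rank
    gaps = margin - (pos[:, None] - neg[None, :])
    return float(np.maximum(gaps, 0.0).mean())


# Three patches on a line; each is linked to its single nearest neighbour.
edges = build_knn_edges([[0, 0], [1, 0], [5, 0]], k=1)

# One positive and one negative slide for a single gene group.
loss_good = pairwise_ranking_loss([2.0, 0.0], [1, 0])  # correctly ordered
loss_bad = pairwise_ranking_loss([0.0, 2.0], [1, 0])   # wrongly ordered
```

A ranking loss of this kind only rewards the correct ordering of slides, which matches a ranking metric such as AUROC more directly than a per-slide classification loss does. In a full pipeline the scores would come from a graph neural network applied to the patch graph, one score per gene group.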
