Histology-informed tiling of whole tissue sections improves the interpretability and predictability of cancer relapse and genetic alterations

Willem Bonnaffé; Yang Hu; Andrea Chatrian; Mengran Fan; Stefano Malacrino; Sandy Figiel; CRUK ICGC Prostate Group; Srinivasa R. Rao; Richard Colling; Richard J. Bryant; Freddie C. Hamdy; Dan J. Woodcock; Ian G. Mills; Clare Verrill; Jens Rittscher

TL;DR

Grid-based tiling in digital pathology discards tissue architecture, which limits interpretability and predictive power. Histology-informed tiling (HIT) uses semantic segmentation (GlandSeg) to extract gland-centric patches from whole slide images (WSIs) for multiple-instance learning (MIL) and gland morphology phenotyping, followed by clustering and pooled MIL predictions. Across the ProMPT, ICGC-C, and TCGA-PRAD data, HIT achieves a gland Dice score of $0.83 \pm 0.17$ and a stroma Dice score of $0.91 \pm 0.12$, and extracts 380,000 glands from 760 WSIs. MIL AUCs improve by about $10\%$ for detecting copy number variations (CNVs) in MYC and in genes related to the epithelial-mesenchymal transition (EMT), and 15 gland morphology clusters correlate with relapse and Gleason patterns. By focusing on biologically meaningful structures, the approach improves interpretability and efficiency, and it may apply broadly to other epithelial cancers.

Abstract

Histopathologists establish cancer grade by assessing histological structures, such as glands in prostate cancer. Yet digital pathology pipelines often rely on grid-based tiling that ignores tissue architecture, which introduces irrelevant information and limits interpretability. We introduce histology-informed tiling (HIT), which uses semantic segmentation to extract glands from whole slide images (WSIs) as biologically meaningful input patches for multiple-instance learning (MIL) and phenotyping. Trained on 137 samples from the ProMPT cohort, HIT achieved a gland-level Dice score of 0.83 ± 0.17. By extracting 380,000 glands from 760 WSIs across the ICGC-C and TCGA-PRAD cohorts, HIT improved the AUCs of MIL models by 10% for detecting copy number variations (CNVs) in MYC and in genes related to the epithelial-mesenchymal transition (EMT), and revealed 15 gland clusters, several of which were associated with cancer relapse, oncogenic mutations, and high Gleason grade. HIT therefore improved the accuracy and interpretability of MIL predictions, while streamlining computations by focussing on biologically meaningful structures during feature extraction.
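
To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' GlandSeg implementation. It assumes the segmentation output is already available as a binary gland mask stored as nested lists of 0/1. The names `dice_score` and `gland_bounding_boxes`, the 4-connectivity rule, and the `margin` parameter are illustrative assumptions that do not come from the paper. The first function computes the Dice overlap between a predicted and a reference mask; the second returns one bounding box per connected gland region, which is the sense in which a patch is gland-centric rather than grid-based.

```python
from collections import deque


def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks (rows of 0/1)."""
    intersection = sum(p & t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def gland_bounding_boxes(mask, margin=1):
    """One (row0, col0, row1, col1) box per 4-connected region, end-exclusive."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0, r1, c1 = min(r0, y), min(c0, x), max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Pad with `margin` pixels of context, clipped to the image bounds.
            boxes.append((max(r0 - margin, 0), max(c0 - margin, 0),
                          min(r1 + 1 + margin, height), min(c1 + 1 + margin, width)))
    return boxes
```

Each returned box can be used to crop the corresponding region of the slide, so that every MIL instance covers one gland plus a small band of surrounding stroma instead of an arbitrary grid cell. In practice a WSI is far too large for nested lists, and an array library with a connected-component labelling routine would be the usual choice.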
