Histology-informed tiling of whole tissue sections improves the interpretability and predictability of cancer relapse and genetic alterations

Willem Bonnaffé, Yang Hu, Andrea Chatrian, Mengran Fan, Stefano Malacrino, Sandy Figiel, CRUK ICGC Prostate Group, Srinivasa R. Rao, Richard Colling, Richard J. Bryant, Freddie C. Hamdy, Dan J. Woodcock, Ian G. Mills, Clare Verrill, Jens Rittscher

TL;DR

Grid-based tiling in digital pathology discards tissue architecture, limiting interpretability and predictive power. HIT uses semantic segmentation (GlandSeg) to extract gland-centric patches from WSIs for MIL and gland morphology phenotyping, followed by clustering and pooled MIL predictions. In ProMPT/ICGC-C/TCGA-PRAD data, HIT achieves a gland Dice of $0.83 \pm 0.17$ and stroma Dice of $0.91 \pm 0.12$, and extracts $380{,}000$ glands from $760$ WSIs, with MIL AUC improvements of about $10\%$ for CNV detection in EMT-related genes and MYC; 15 gland morphology clusters correlate with relapse and Gleason patterns. This approach offers improved interpretability and efficiency by focusing on biologically meaningful structures and suggests broad applicability to other epithelial cancers.

Abstract

Histopathologists establish cancer grade by assessing histological structures, such as glands in prostate cancer. Yet, digital pathology pipelines often rely on grid-based tiling that ignores tissue architecture. This introduces irrelevant information and limits interpretability. We introduce histology-informed tiling (HIT), which uses semantic segmentation to extract glands from whole slide images (WSIs) as biologically meaningful input patches for multiple-instance learning (MIL) and phenotyping. Trained on 137 samples from the ProMPT cohort, HIT achieved a gland-level Dice score of 0.83 ± 0.17. By extracting 380,000 glands from 760 WSIs across the ICGC-C and TCGA-PRAD cohorts, HIT improved the AUCs of MIL models by 10% for detecting copy number variations (CNVs) in genes related to epithelial-mesenchymal transition (EMT) and in MYC, and revealed 15 gland clusters, several of which were associated with cancer relapse, oncogenic mutations, and high Gleason grade. Therefore, HIT improved the accuracy and interpretability of MIL predictions, while streamlining computations by focussing on biologically meaningful structures during feature extraction.
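
For readers unfamiliar with the Dice score reported above, the following is a minimal sketch of how such a score can be computed for one binary mask pair and then summarised as mean ± standard deviation. It is not the authors' evaluation code: the function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, truth):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|).

    Assumption: two empty masks count as perfect agreement (1.0).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0
    intersection = np.logical_and(pred, truth).sum()
    return float(2.0 * intersection / denom)


# Toy example (hypothetical masks, not data from the paper).
pred_a = np.array([[1, 1], [0, 0]])
truth_a = np.array([[1, 0], [0, 0]])
pred_b = np.array([[1, 1], [1, 1]])
truth_b = np.array([[1, 1], [1, 1]])

scores = np.array([dice_score(pred_a, truth_a), dice_score(pred_b, truth_b)])
# Summarise per-sample scores as mean and standard deviation.
summary = (scores.mean(), scores.std())
```

A figure such as 0.83 ± 0.17 would then correspond to the mean and spread of per-sample scores, assuming that is how the summary was formed.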

Paper Structure

This paper contains 4 sections, 8 figures.

Figures (8)

  • Figure 1: Overview of the histology-informed tiling (HIT) framework and its applications. Panel A shows the difference between grid tiling and HIT, also named semantic tiling. Panel B shows the steps required to perform semantic tiling, which involves semantic segmentation and object extraction. Panel C provides a first application for phenotyping cancer tissue morphology with dimensionality reduction, contrastive learning (where the dashed-line box indicates an optional step), and clustering. Panel D displays a second application in multiple instance learning based on patches extracted by HIT.
  • Figure 2: Performance and exemplar results from the segmentation and object extraction pipeline. Panel A displays the accuracy (Dice score) of the model in segmenting different compartments of the tissue: stroma, epithelium, lumen, and glands (epithelium and lumen together). Panel B shows examples of predicted masks compared to ground truth annotations for randomly selected patches. Panel C demonstrates the slide-level mask obtained after stitching patch-level predictions and the object extraction step. Panel D shows examples of semantically extracted tiles, where each patch contains an individual glandular structure. The background is coloured black, epithelia green, lumen blue, and nuclei red.
  • Figure 3: Phenotyping of glandular structures and comparison with pathologists’ annotations. Panel A shows a two-dimensional projection of the embeddings of glands extracted from FFPE samples from TCGA-PRAD. Each dot corresponds to a gland and the colours indicate the 15 morphological clusters (indexed 1–15) identified through hierarchical agglomerative clustering (HAC). Panel B displays the corresponding Gleason grade associated with these glands. Panel C shows randomly selected examples of tiles from each cluster (black: background, green: epithelium, blue: lumen, and red: nuclei), arranged according to a dendrogram of cluster centroids, which groups the most morphologically similar clusters. Panel D shows a comparison of the proportion of nuclei in each tile across the different clusters.
  • Figure 4: Impact of HIT and contrastive learning on the performance of multiple-instance learning (MIL) models for relapse and CNV detection. Panel A shows a benchmark of the accuracy (AUC) of a MIL pipeline in detecting BCR (biochemical recurrence, or relapse) in FFPE patient samples from the ICGC-C cohort, and copy number variation of EMT genes (EMT-CNV) and copy number gain of the MYC gene in OCT patient samples from the TCGA-PRAD cohort. The y-axis labels indicate which method was used for tiling (GT: grid tiling, or ST: semantic tiling, a.k.a. HIT) and which backbone was used for compression (ResNet-18, ResNet-50, or ViT). Panel B shows the effects of increasing the number of epochs of contrastive learning on AUC for the ResNet-18 backbone on the three MIL tasks. CL1–25 indicate the number of epochs of contrastive learning fine-tuning. In both panels, the aggregator architecture used was CLAM (Lu et al., 2021).
  • Figure 5: Phenotyping of glands associated with copy number gain of EMT-related genes. Panel A displays a two-dimensional projection of the embeddings of glands extracted from OCT samples of the TCGA-PRAD cohort. The colour indicates the index of the 1–15 clusters established through HAC. Panel B shows the corresponding prediction-attention-weighted scores for each instance (i.e. EMT scores), where blue and red dots are instances that decrease or increase the probability of EMT gain, respectively. White dots are intermediate instances that neither increase nor decrease the probability. Panel C shows examples from each identified cluster (with epithelium in green, lumen in blue, and nuclei in red). Relationships between morphologically similar clusters are indicated by a dendrogram. Panel D shows the frequency of instances with low (blue), intermediate (yellow), and high (red) EMT scores, and the corresponding proportion of nuclei, within each cluster.
  • ...and 3 more figures
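
The contrast between grid tiling and semantic tiling described in the caption of Figure 1 (Panels A and B) can be illustrated with a minimal sketch. It assumes a binary gland mask (epithelium and lumen together) is already available from a segmentation model, and it uses plain 4-connected component labelling as a stand-in for the object extraction step; the paper's actual extraction procedure is not specified here. The names `extract_objects`, `grid_tiles`, `min_area`, and `pad` are hypothetical.

```python
from collections import deque

import numpy as np


def extract_objects(mask, min_area=1, pad=0):
    """Bounding boxes (row0, row1, col0, col1) of 4-connected foreground
    components, one box per object; row1 and col1 are exclusive."""
    mask = np.asarray(mask, dtype=bool)
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for r in range(h):
        for c in range(w):
            if not mask[r, c] or seen[r, c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r, c] = True
            r0 = r1 = r
            c0 = c1 = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if area >= min_area:  # drop tiny fragments
                boxes.append((max(r0 - pad, 0), min(r1 + 1 + pad, h),
                              max(c0 - pad, 0), min(c1 + 1 + pad, w)))
    return boxes


def grid_tiles(shape, tile):
    """Regular grid tiles that ignore tissue content, for comparison."""
    h, w = shape
    return [(r, min(r + tile, h), c, min(c + tile, w))
            for r in range(0, h, tile) for c in range(0, w, tile)]


# Toy mask with two separate "glands" (hypothetical data).
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:3, 1:4] = 1
mask[4:6, 5:8] = 1

semantic_boxes = extract_objects(mask)
grid_boxes = grid_tiles(mask.shape, tile=4)
# Crop one patch per extracted object from the mask (or from the image).
patches = [mask[r0:r1, c0:c1] for r0, r1, c0, c1 in semantic_boxes]
```

In this toy case the semantic approach yields one patch per object, whereas the grid yields four tiles regardless of where the objects lie, which is the distinction the framework builds on.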