Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

Andrew H. Song, Richard J. Chen, Tong Ding, Drew F. K. Williamson, Guillaume Jaume, Faisal Mahmood

TL;DR

Panther introduces an unsupervised, prototype-based framework for summarizing whole-slide image representations in computational pathology by modeling patch embeddings with a Gaussian mixture model and extracting per-prototype statistics as a fixed-length slide embedding. This approach captures morphological heterogeneity through a compact set of prototypes, enabling strong performance on subtyping and survival tasks while also providing interpretable prototype assignment maps. Panther demonstrates competitive or superior results relative to both unsupervised and supervised MIL baselines across 13 datasets, and its interpretability reveals morphologically meaningful tissue concepts such as tumor, stroma, and immune patterns. The work advances task-agnostic slide representation learning with a scalable, interpretable framework that can generalize across histology domains and downstream analyses.

Abstract

Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). However, the slide representations resulting from this approach are highly tailored to specific clinical tasks, which limits their expressivity and generalization, particularly in scenarios with limited data. Instead, we hypothesize that morphological redundancy in tissue can be leveraged to build a task-agnostic slide representation in an unsupervised fashion. To this end, we introduce PANTHER, a prototype-based approach rooted in the Gaussian mixture model that summarizes the set of WSI patches into a much smaller set of morphological prototypes. Specifically, each patch is assumed to have been generated from a mixture distribution, where each mixture component represents a morphological exemplar. Utilizing the estimated mixture parameters, we then construct a compact slide representation that can be readily used for a wide range of downstream tasks. By performing an extensive evaluation of PANTHER on subtyping and survival tasks using 13 datasets, we show that 1) PANTHER outperforms or is on par with supervised MIL baselines and 2) the analysis of morphological prototypes brings new qualitative and quantitative insights into model interpretability.
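
To make the idea of a mixture-based slide representation concrete, the following is a minimal sketch, not the authors' implementation. It fits a diagonal-covariance Gaussian mixture to synthetic "patch embeddings" with plain EM and concatenates the estimated weights, means, and variances into one fixed-length vector. The function names (`fit_gmm`, `slide_embedding`), the diagonal covariance, the number of EM iterations, and the way the prototype means are initialized are all assumptions made for illustration.

```python
import math
import random

def log_gauss_diag(x, mu, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mu, var)
    )

def posteriors(x, pi, mu, var):
    """E-step for one patch: P(prototype c | x), computed in log space."""
    logp = [math.log(p) + log_gauss_diag(x, m, v) for p, m, v in zip(pi, mu, var)]
    top = max(logp)
    w = [math.exp(l - top) for l in logp]
    total = sum(w)
    return [wi / total for wi in w]

def fit_gmm(patches, init_mu, n_iter=20, var_floor=1e-3):
    """Plain EM for a diagonal GMM, started from the given prototype means."""
    C, D, N = len(init_mu), len(init_mu[0]), len(patches)
    pi = [1.0 / C] * C
    mu = [list(m) for m in init_mu]
    var = [[1.0] * D for _ in range(C)]
    for _ in range(n_iter):
        resp = [posteriors(x, pi, mu, var) for x in patches]
        for c in range(C):
            nc = sum(r[c] for r in resp) + 1e-12  # soft count of patches in prototype c
            pi[c] = nc / N
            mu[c] = [sum(r[c] * x[d] for r, x in zip(resp, patches)) / nc
                     for d in range(D)]
            var[c] = [
                max(sum(r[c] * (x[d] - mu[c][d]) ** 2
                        for r, x in zip(resp, patches)) / nc, var_floor)
                for d in range(D)
            ]
    return pi, mu, var

def slide_embedding(pi, mu, var):
    """Concatenate per-prototype (weight, mean, variance) into one vector."""
    emb = []
    for p, m, v in zip(pi, mu, var):
        emb += [p, *m, *v]
    return emb

# Synthetic stand-in for patch embeddings: two well-separated "morphologies".
random.seed(0)
D = 4
centers = [[0.0] * D, [5.0] * D]
patches = [[random.gauss(c, 0.5) for c in centers[i % 2]] for i in range(200)]

# Assumption: prototypes are initialized from two example patches; a real
# pipeline would more plausibly start from clusters found across many slides.
init_mu = [patches[0], patches[1]]
pi, mu, var = fit_gmm(patches, init_mu)
emb = slide_embedding(pi, mu, var)
print(len(emb))  # 18 = 2 prototypes * (1 weight + 4 means + 4 variances)
```

The embedding length depends only on the number of prototypes and the feature dimension, not on the number of patches, which is what lets slides of very different sizes share one downstream predictor.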

Paper Structure (27 sections, 13 equations, 7 figures, 5 tables)

This paper contains 27 sections, 13 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Slide decomposition into morphological prototypes. Due to morphological redundancy across and within the tissue, a slide can be decomposed into prototypes. We introduce $\textsc{Panther}$, a method that can identify and extract morphological prototypes to form a compact and unsupervised slide representation.
  • Figure 2: Overview of the $\textsc{Panther}$ workflow. A whole-slide image (WSI) is segmented and divided into a set of patches. Each patch is encoded into a compressed feature by a feature extractor pretrained on a large histopathology dataset. $\textsc{Panther}$ models the distribution of patch embeddings with a Gaussian mixture model, with each mixture component corresponding to a morphologically distinct prototype. The estimated model parameters are concatenated to form the slide representation, which can be used as input to a predictor module for clinical downstream tasks and visualized as a prototypical assignment map.
  • Figure 3: Prototype-oriented heatmap interpretation. (A) Examples of WSIs and prototypical assignment maps from LUAD and LUSC, with estimated prototype distribution $\hat{\pi}_c$ for each WSI. (B) Prototype distribution and morphological annotations by a board-certified pathologist in the NSCLC cohort. The adenocarcinoma prototypes (C2, C15) and the squamous cell carcinoma prototype (C12) appear exclusively in LUAD and LUSC, respectively, showing that $\textsc{Panther}$ can correctly capture essential morphological cues in the tissue.
  • Figure S1: Prototype-oriented heatmap interpretation of BLCA. (A) Visualization of prototypical assignment map in an exemplar BLCA H&E WSI, with zoomed-in histopathology ROI of tumor-invading muscle (C2, C8, C11, C16). We show the posterior probability heatmap for the tumor-containing C2 prototype, which has strong concordance with a tumor probability heatmap obtained by a supervised patch-level classifier for BLCA tumor prediction. (B) Prototype distribution $\hat{\pi}_c$ of the exemplar slide. (C) Morphological annotations of all prototypes by a board-certified pathologist in the BLCA cohort.
  • Figure S2: Prototype-oriented heatmap interpretation of BRCA. (A) Visualization of prototypical assignment map in an exemplar BRCA H&E WSI, with zoomed-in histopathology ROI of dense tumor nests (C16) with surrounding connective tissue (C10), and adipose tissue (C9) with tumor presence (C3). We show the posterior probability heatmap for the tumor-containing C16 prototype, which has strong concordance with a tumor probability heatmap obtained by a supervised patch-level classifier for BRCA tumor prediction. (B) Prototype distribution $\hat{\pi}_c$ of the exemplar slide. (C) Morphological annotations of all prototypes by a board-certified pathologist in the BRCA cohort.
  • ...and 2 more figures
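
The prototypical assignment maps and posterior probability heatmaps described in the captions above can be illustrated with a small sketch. It assumes a diagonal-covariance Gaussian mixture whose parameters are already estimated, and it labels each patch location with its most probable prototype. The helper names (`assignment_map`, `posterior_heatmap`), the toy grid, and the hand-picked mixture parameters are hypothetical and do not come from the paper.

```python
import math

def log_gauss_diag(x, mu, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mu, var)
    )

def posterior(x, pi, mu, var):
    """P(prototype c | patch embedding x) under the mixture."""
    logp = [math.log(p) + log_gauss_diag(x, m, v) for p, m, v in zip(pi, mu, var)]
    top = max(logp)
    w = [math.exp(l - top) for l in logp]
    total = sum(w)
    return [wi / total for wi in w]

def assignment_map(coords, patches, pi, mu, var, grid_shape):
    """Label each patch location with its most probable prototype (-1 = no tissue)."""
    rows, cols = grid_shape
    grid = [[-1] * cols for _ in range(rows)]
    for (r, c), x in zip(coords, patches):
        post = posterior(x, pi, mu, var)
        grid[r][c] = max(range(len(post)), key=post.__getitem__)
    return grid

def posterior_heatmap(coords, patches, pi, mu, var, grid_shape, proto):
    """Posterior probability of one chosen prototype at each patch location."""
    rows, cols = grid_shape
    heat = [[0.0] * cols for _ in range(rows)]
    for (r, c), x in zip(coords, patches):
        heat[r][c] = posterior(x, pi, mu, var)[proto]
    return heat

# Toy mixture with two prototypes in a 2-D feature space (illustrative values).
pi = [0.5, 0.5]
mu = [[0.0, 0.0], [4.0, 4.0]]
var = [[1.0, 1.0], [1.0, 1.0]]

# Three tissue patches on a 2x3 grid; the remaining cells are background.
coords = [(0, 0), (0, 1), (1, 2)]
patches = [[0.1, -0.2], [3.9, 4.2], [0.3, 0.1]]

grid = assignment_map(coords, patches, pi, mu, var, (2, 3))
heat = posterior_heatmap(coords, patches, pi, mu, var, (2, 3), proto=1)
print(grid)  # [[0, 1, -1], [-1, -1, 0]]
```

Taking the argmax gives a hard map that is easy to color by prototype, while the per-prototype posterior keeps the soft probabilities that a heatmap for a single prototype would display.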