Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology
Andrew H. Song, Richard J. Chen, Tong Ding, Drew F. K. Williamson, Guillaume Jaume, Faisal Mahmood
TL;DR
PANTHER is an unsupervised, prototype-based framework for building whole-slide image (WSI) representations in computational pathology. It models the patch embeddings of a slide with a Gaussian mixture model and uses the per-prototype statistics as a fixed-length slide embedding. A compact set of prototypes captures morphological heterogeneity, which supports strong performance on subtyping and survival tasks and also yields interpretable prototype assignment maps. Across 13 datasets, PANTHER is competitive with or superior to both unsupervised and supervised MIL baselines, and its prototypes correspond to morphologically meaningful tissue concepts such as tumor, stroma, and immune patterns. The work advances task-agnostic slide representation learning with a scalable, interpretable framework that can generalize across histology domains and downstream analyses.
Abstract
Representation learning of pathology whole-slide images (WSIs) has primarily relied on weak supervision with Multiple Instance Learning (MIL). However, the slide representations resulting from this approach are highly tailored to specific clinical tasks, which limits their expressivity and generalization, particularly in scenarios with limited data. Instead, we hypothesize that morphological redundancy in tissue can be leveraged to build a task-agnostic slide representation in an unsupervised fashion. To this end, we introduce PANTHER, a prototype-based approach rooted in the Gaussian mixture model that summarizes the set of WSI patches into a much smaller set of morphological prototypes. Specifically, each patch is assumed to have been generated from a mixture distribution, where each mixture component represents a morphological exemplar. Utilizing the estimated mixture parameters, we then construct a compact slide representation that can be readily used for a wide range of downstream tasks. By performing an extensive evaluation of PANTHER on subtyping and survival tasks using 13 datasets, we show that 1) PANTHER outperforms or is on par with supervised MIL baselines and 2) the analysis of morphological prototypes brings new qualitative and quantitative insights into model interpretability.
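The idea of summarizing a slide by its estimated mixture parameters can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes patch embeddings are already available as an array, uses diagonal covariances and random initialization, and builds the slide embedding by concatenating mixture weights, means, and variances. The names `fit_diag_gmm` and `slide_embedding` are hypothetical and exist only for this example.

```python
import numpy as np

def fit_diag_gmm(X, n_prototypes, n_iter=20, seed=0, eps=1e-6):
    """Fit a diagonal-covariance GMM to patch embeddings X of shape (N, D) with EM.

    Assumption for illustration only: random initialization from patches and
    diagonal covariances; the source does not specify these choices.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Initialize means from random patches, a shared variance, and uniform weights.
    mu = X[rng.choice(N, size=n_prototypes, replace=False)].copy()
    var = np.tile(X.var(axis=0) + eps, (n_prototypes, 1))
    pi = np.full(n_prototypes, 1.0 / n_prototypes)

    for _ in range(n_iter):
        # E-step: posterior probability of each prototype for each patch.
        diff = X[:, None, :] - mu[None, :, :]                      # (N, C, D)
        log_p = -0.5 * (
            np.sum(diff ** 2 / var[None], axis=2)
            + np.sum(np.log(2.0 * np.pi * var), axis=1)[None]
        )
        log_p += np.log(pi)[None]
        log_p -= log_p.max(axis=1, keepdims=True)                  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)                    # (N, C)

        # M-step: re-estimate weights, means, and variances from responsibilities.
        Nk = resp.sum(axis=0) + eps
        pi = Nk / Nk.sum()
        mu = (resp.T @ X) / Nk[:, None]
        diff = X[:, None, :] - mu[None, :, :]
        var = np.einsum("nc,ncd->cd", resp, diff ** 2) / Nk[:, None] + eps

    # resp holds the responsibilities from the final E-step.
    return pi, mu, var, resp

def slide_embedding(pi, mu, var):
    """Concatenate per-prototype statistics into one fixed-length vector."""
    return np.concatenate([pi, mu.ravel(), var.ravel()])
```

With C prototypes and D-dimensional patch features, the embedding in this sketch has length C(1 + 2D) regardless of how many patches the slide contains. Taking `resp.argmax(axis=1)` gives a hard prototype label per patch, which is one simple way to draw a prototype assignment map. The (N, C, D) intermediate array is kept for readability; a slide with many patches would call for a more memory-efficient E-step.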
