Explaining Digital Pathology Models via Clustering Activations
Adam Bajger, Jan Obdržálek, Vojtěch Kůr, Rudolf Nenutil, Petr Holub, Vít Musil, Tomáš Brázdil
TL;DR
This work addresses explainability in digital pathology by moving beyond local saliency maps to a clustering-based approach that reveals global model behavior. The method applies non-negative matrix factorization to CNN activations and assigns tissue regions to interpretable clusters, producing per-pixel heatmaps that reflect how the model processes whole slides. The authors validate the approach on a prostate cancer model: the clusters align with meaningful morphological features and correlate with GradCAM while offering more nuanced explanations, and quantitative metrics show strong cancer-prediction performance when cluster weights are used as the input. The technique is meant to increase trust and support faster clinical adoption, and it is extensible to other architectures, including transformers, in digital pathology and potentially in other spatial-domain AI tasks.
Abstract
We present a clustering-based explainability technique for digital pathology models based on convolutional neural networks. Commonly used methods based on saliency maps, such as occlusion, GradCAM, or relevance propagation, highlight the regions that contribute the most to the prediction for a single slide. In contrast, our method shows the global behaviour of the model under consideration, while also providing more fine-grained information. The resulting clusters can be visualised not only to understand the model, but also to increase confidence in its operation, leading to faster adoption in clinical practice. We also evaluate the performance of our technique on an existing model for detecting prostate cancer, demonstrating its usefulness.
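
The TL;DR above describes the core step as applying non-negative matrix factorization (NMF) to CNN activations and assigning regions to clusters. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a single non-negative feature map of shape (height, width, channels), for example the output of a post-ReLU layer, and uses a plain multiplicative-update NMF written in NumPy. The function names `nmf` and `cluster_heatmaps`, the number of clusters `k`, the iteration count, and the synthetic input are all illustrative assumptions; the layer, the solver, and the number of clusters used in the paper are not stated in this section.

```python
import numpy as np


def nmf(V, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (n x c) as W (n x k) @ H (k x c).

    Uses the classic multiplicative updates for the Frobenius-norm objective.
    This solver is an assumption made for the sketch, not the paper's choice.
    """
    rng = np.random.default_rng(seed)
    n, c = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, c)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def cluster_heatmaps(activations, k, **nmf_kwargs):
    """Turn an (h, w, c) non-negative activation map into cluster heatmaps.

    Returns:
      weights:    (h, w, k) soft cluster weight per spatial location
      labels:     (h, w) index of the dominant cluster per location
      components: (k, c) channel profile of each cluster
    """
    h, w, c = activations.shape
    V = activations.reshape(h * w, c)      # one row per spatial location
    W, components = nmf(V, k, **nmf_kwargs)
    weights = W.reshape(h, w, k)           # back to the spatial layout
    labels = weights.argmax(axis=-1)       # hard assignment for display
    return weights, labels, components


# Synthetic stand-in for a real feature map (hypothetical data).
rng = np.random.default_rng(1)
acts = np.maximum(rng.normal(size=(8, 8, 16)), 0.0)
weights, labels, components = cluster_heatmaps(acts, k=3)
```

In this sketch each slice `weights[:, :, j]` is a heatmap at feature-map resolution for cluster `j`. Obtaining per-pixel heatmaps over a whole slide would additionally require upsampling to the input resolution and stitching tiles together, both of which are omitted here.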
