A Graph-Based Framework for Interpretable Whole Slide Image Analysis

Alexander Weers, Alexander H. Berger, Laurin Lux, Peter Schüffler, Daniel Rueckert, Johannes C. Paetzold

TL;DR

This work introduces a graph-based framework for whole-slide image analysis that preserves biological tissue structure by constructing adaptive region graphs from superpixels, rather than relying on fixed grids. It combines embedding-guided graph coarsening with a compact, interpretable feature set (texture, morphology, and nuclear characteristics) and a graph attention network to perform cancer staging and survival prediction. Interpretability is built in: integrated gradients map predictions to specific tissue regions and clinically meaningful nuclear features, addressing trust and adoption barriers in clinical workflows. With more than 13x fewer parameters and more than 300x less data, the approach achieves results competitive with a large foundation model, pointing to a practical, scalable path for interpretable computational pathology.

Abstract

The histopathological analysis of whole-slide images (WSIs) is fundamental to cancer diagnosis but is a time-consuming and expert-driven process. While deep learning methods show promising results, dominant patch-based methods artificially fragment tissue, ignore biological boundaries, and produce black-box predictions. We overcome these limitations with a novel framework that transforms gigapixel WSIs into biologically informed graph representations and is interpretable by design. Our approach builds graph nodes from tissue regions that respect natural structures, not arbitrary grids. We introduce an adaptive graph coarsening technique, guided by learned embeddings, to efficiently merge homogeneous regions while preserving diagnostically critical details in heterogeneous areas. Each node is enriched with a compact, interpretable feature set capturing clinically motivated priors. A graph attention network then performs diagnosis on this compact representation. We demonstrate strong performance on challenging cancer staging and survival prediction tasks. Crucially, our resource-efficient model ($>13\times$ fewer parameters and $>300\times$ less data) achieves results competitive with a massive foundation model, while offering full interpretability through feature attribution. Our code is publicly available at https://github.com/HistoGraph31/pix2pathology.
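To make the coarsening idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each region already has an embedding vector and that adjacency is given as an edge list; adjacent regions are merged when the cosine similarity of their embeddings reaches a threshold. The function name `coarsen_graph`, the cosine-similarity criterion, the threshold value, and the union-find merging (which merges transitively along chains of similar neighbors) are all illustrative assumptions.

```python
import numpy as np

def coarsen_graph(embeddings, edges, threshold=0.9):
    """Merge adjacent nodes with cosine similarity >= threshold (assumed rule).

    embeddings: (n, d) array, one row per region.
    edges: iterable of (u, v) index pairs describing region adjacency.
    Returns (labels, coarse_edges): a merged-node id per original node,
    and the sorted, deduplicated edges between merged nodes.
    """
    n = embeddings.shape[0]
    # Normalise rows so a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)

    parent = list(range(n))  # union-find forest over the original nodes

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v in edges:
        if float(unit[u] @ unit[v]) >= threshold:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[max(ru, rv)] = min(ru, rv)

    # Relabel roots to consecutive ids in order of first appearance.
    remap = {}
    labels = np.array([remap.setdefault(find(i), len(remap)) for i in range(n)])

    # Keep only edges that still connect two different merged nodes.
    coarse_edges = sorted({
        (int(min(labels[u], labels[v])), int(max(labels[u], labels[v])))
        for u, v in edges
        if labels[u] != labels[v]
    })
    return labels, coarse_edges

# Toy example: two homogeneous pairs of regions joined by one dissimilar edge.
emb = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
labels, coarse = coarsen_graph(emb, [(0, 1), (1, 2), (2, 3)], threshold=0.9)
print(labels.tolist(), coarse)  # [0, 0, 1, 1] [(0, 1)]
```

In this toy case the two similar pairs collapse into one node each, while the edge between dissimilar regions survives, which mirrors the stated goal of merging homogeneous areas and keeping detail where the tissue is heterogeneous.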

Paper Structure

This paper contains 13 sections, 3 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overview of our method for transforming WSIs into graph representations aligning with natural tissue borders. Starting from a WSI, superpixels are segmented by clustering pixels based on color and spatial proximity (1) to form a coarse region adjacency (RA) graph $\mathcal{G}_0$ (2). Region embeddings are obtained using a contrastively pretrained CNN (3), enabling similarity-based merging of adjacent nodes (4). Interpretable features (texture, morphological, and nuclear characteristics) enrich each node, resulting in a compact yet information-rich representation (6). Using this graph, we train a graph attention network for the different diagnostic tasks (7). Through integrated gradients, the model's predictions can be attributed to regions and features (8).
  • Figure 2: Illustration of fine-grained, tissue-adaptive segmentation. Superpixel boundaries, highlighted in yellow, are overlaid on a tissue micrograph to demonstrate their precise alignment with the underlying morphological structures.
  • Figure 3:
  • Figure 4: Top left: original WSI; top right: important regions highlighted; bottom left: zoomed-in view of the highlighted region; bottom right: feature attribution scores.
  • Figure 5: Top left: original WSI; top right: important regions highlighted; bottom left: zoomed-in view of the highlighted region; bottom right: feature attribution scores.
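
The feature attribution scores shown in Figures 4 and 5 come from integrated gradients. As a rough illustration of how that attribution works, the sketch below applies it to a toy logistic model with an analytic gradient. The toy model, the all-zeros baseline, the midpoint-rule approximation, and every name in the code are assumptions made for illustration; the paper applies the method to a graph attention network over node features, which is not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def model(x, w, b):
    # Toy stand-in for the trained network: a single logistic unit.
    return sigmoid(x @ w + b)

def model_grad(x, w, b):
    # Analytic gradient of the logistic output with respect to the input x.
    p = model(x, w, b)
    return p * (1.0 - p) * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate integrated gradients along the straight path baseline -> x."""
    # Midpoint rule: sample alpha at the centre of each of `steps` intervals.
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, d)
    grads = np.stack([model_grad(p, w, b) for p in path])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)

w = np.array([2.0, -1.0, 0.0])   # hypothetical weights; third feature unused
b = 0.1
x = np.array([1.0, 0.5, 3.0])    # hypothetical feature vector for one node
baseline = np.zeros_like(x)

attr = integrated_gradients(x, baseline, w, b)
# Completeness check: attributions should sum to f(x) - f(baseline).
gap = attr.sum() - (model(x, w, b) - model(baseline, w, b))
print(attr, abs(gap) < 1e-4)
```

The completeness check is what makes per-feature scores readable: the attributions add up to the change in model output relative to the baseline, and a feature the model ignores (the third one here) receives zero attribution.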