GRAPHITE: Graph-Based Interpretable Tissue Examination for Enhanced Explainability in Breast Cancer Histopathology

Raktim Kumar Mondol; Ewan K. A. Millar; Peter H. Graham; Lois Browne; Arcot Sowmya; Erik Meijering

TL;DR

GRAPHITE (Graph-based Interpretable Tissue Examination) is a post-hoc explainable framework for breast cancer tissue microarray (TMA) analysis. It produces interpretable visualisations that align with pathologists' diagnostic reasoning and support precision medicine.

Abstract

Explainable AI (XAI) in medical histopathology is essential for enhancing the interpretability and clinical trustworthiness of deep learning models in cancer diagnosis. However, the black-box nature of these models often limits their clinical adoption. We introduce GRAPHITE (Graph-based Interpretable Tissue Examination), a post-hoc explainable framework designed for breast cancer tissue microarray (TMA) analysis. GRAPHITE employs a multiscale approach: it extracts patches at several magnification levels, constructs a hierarchical graph, and uses graph attention networks (GAT) with a scalewise attention network (SAN) to capture scale-dependent features. We trained the model on 140 tumour TMA cores and 140 benign samples created from four benign whole slide images, and tested it on 53 pathologist-annotated TMA samples. GRAPHITE outperformed traditional XAI methods, achieving a mean average precision (mAP) of 0.56, an area under the receiver operating characteristic curve (AUROC) of 0.94, and a threshold robustness (ThR) of 0.70, indicating that the model maintains high performance across a wide range of thresholds. For clinical utility, GRAPHITE achieved the highest area under the decision curve (AUDC) of 4.17e+5, indicating reliable decision support across thresholds. These results highlight GRAPHITE's potential as a clinically valuable tool in computational pathology, providing interpretable visualisations that align with pathologists' diagnostic reasoning and support precision medicine.
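
As an illustration of how an AUROC such as the one reported above could be obtained for a saliency map, the sketch below scores each pixel against a binary annotation mask using the rank-sum (Mann-Whitney U) identity. It assumes a pixel-level evaluation, which the abstract does not state explicitly; the function name `auroc_from_saliency` and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def auroc_from_saliency(saliency, mask):
    """Pixel-level AUROC of a saliency map against a binary mask.

    Assumption: every pixel is one sample, the saliency value is its score,
    and the mask marks annotated tumour pixels as 1 (or True).
    """
    scores = np.asarray(saliency, dtype=float).ravel()
    labels = np.asarray(mask).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("mask must contain both tumour and non-tumour pixels")

    # Rank all scores (1-based), giving tied scores their average rank.
    order = np.argsort(scores, kind="mergesort")
    _, first, counts = np.unique(
        scores[order], return_index=True, return_counts=True
    )
    avg_rank = first + (counts + 1) / 2.0
    ranks = np.empty(scores.size, dtype=float)
    ranks[order] = np.repeat(avg_rank, counts)

    # Mann-Whitney U for the positive class, normalised to [0, 1].
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy 2x2 example: the two brightest pixels are the annotated ones.
    toy_saliency = np.array([[0.9, 0.8], [0.2, 0.1]])
    toy_mask = np.array([[1, 1], [0, 0]])
    print(auroc_from_saliency(toy_saliency, toy_mask))
```

A value of 1.0 means every annotated pixel receives a higher saliency than every unannotated pixel, and 0.5 corresponds to a map that carries no ranking information.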

Paper Structure (42 sections, 35 equations, 10 figures, 1 table)

Introduction
Literature Review
Materials and Methods
Data Collection
Data Preparation
Slide Annotation
Image Preprocessing
Proposed Methods
Stage 1: MIL-Based Classification
Patch Extraction
Feature Extraction With ResNet18
Patch-Level Aggregation Using Attention Mechanism
Core-Level Classification and Patient Projector
Stage 2: GRAPHITE-Based Visualisation
Multiscale Patch Extraction
...and 27 more sections

Figures (10)

Figure 1: Architecture of the TMA classification model (tumour versus normal). The pipeline begins with a TMA slide, which contains multiple circular TMA cores. Each core is divided into $N$ patches, each $224 \times 224 \times 3$ pixels. The patches from each core are processed using a ResNet18 model pretrained on the NCT-CRC-HE-100K dataset with 100,000 non-overlapping patches from H&E-stained colorectal cancer and normal tissues spanning 9 classes. The attention module aggregates patch-level information at the core level, which is passed to the patient projector. The final classification is made through a dense layer classifier, distinguishing between tumour and normal tissue.
Figure 2: Multiscale hierarchical model for saliency mapping in TMA analysis. Core patches are processed at three magnification levels: Level 0 (40$\times$, 0.25 µm/pixel), Level 1 (20$\times$, 0.50 µm/pixel) and Level 2 (10$\times$, 1 µm/pixel) with 224$\times$224-pixel patches. These are passed through a fine-tuned ResNet18 for feature extraction and structured into a hierarchical graph, where edges represent spatial relationships. A graph attention network (GAT) applies attention weights, and a scalewise attention network (SAN) integrates multiscale information. Finally, a saliency map is computed through multilevel weighted and confidence-based fusion, highlighting key tumour regions for diagnostic interpretation.
Figure 3: Visualisation pipeline of GRAPHITE for breast cancer TMA analysis of two sample cores. Row 1: The actual tissue cores. Row 2: The attention maps across three levels (Level 0, Level 1, and Level 2), representing multiscale analysis with different magnification levels. Row 3: The level-wise attention maps are combined through a multilevel fusion process; attention maps generated by the FullGrad and MIL methods are then integrated to enhance feature interpretability. Row 4: Confidence-based optimal fusion combines the multilevel attention maps, creating a refined representation of salient regions. The Final Overlay visually aligns this fused attention map with the tissue core, and the Annotated Mask indicates the pathologist-verified cancerous regions.
Figure 4: Receiver operating characteristic (ROC) curves for various XAI methods on breast cancer TMA analysis. AUROC for each method is displayed in the legend, with GRAPHITE-V2 achieving the highest AUROC of 0.94, followed closely by GRAPHITE-V1 and GRAPHITE-Base, both at 0.93. This performance demonstrates the superior discriminative ability of GRAPHITE variants compared to other XAI methods, such as Attention (AUROC = 0.92) and FullGrad (AUROC = 0.91). Model-agnostic methods LIME (AUROC = 0.83) and SHAP (AUROC = 0.61) show lower performance in this context. The colour gradient along the curves represents different threshold values.
Figure 5: Precision-recall (PR) curves for various XAI methods on breast cancer TMA analysis. AUPRC for each method is displayed in the legend, with GRAPHITE-V2 achieving the highest AUPRC of 0.78, followed by GRAPHITE-Base and Attention, both with an AUPRC of 0.77. This performance highlights the superior PR balance of GRAPHITE variants compared to other XAI methods, such as FullGrad (AUPRC = 0.65) and Grad-CAM++ (AUPRC = 0.62). LIME (AUPRC = 0.52) and SHAP (AUPRC = 0.31) exhibit significantly lower precision-recall performance. The colour gradient along the curves represents different threshold values.
...and 5 more figures
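
The scale relationships in the Figure 2 caption can be checked with a short calculation. The sketch below derives, for each level, the magnification, the resolution in µm/pixel, and the physical width covered by one 224-pixel patch, using only the numbers given in the caption. The helper `children_per_parent` additionally assumes that patch grids at different levels are aligned and non-overlapping; that is an illustrative assumption, not a description of how the paper builds its hierarchical graph. All function and constant names are hypothetical.

```python
BASE_MAGNIFICATION = 40   # Level 0 magnification, from the Figure 2 caption
BASE_MPP = 0.25           # Level 0 resolution in µm/pixel, from the caption
PATCH_PX = 224            # patch side length in pixels, from the caption


def level_geometry(level):
    """Return (magnification, µm/pixel, patch width in µm) for a level.

    Each level halves the magnification and doubles the pixel size,
    matching Level 0, 1 and 2 in the caption.
    """
    scale = 2 ** level
    magnification = BASE_MAGNIFICATION / scale
    mpp = BASE_MPP * scale
    patch_width_um = PATCH_PX * mpp
    return magnification, mpp, patch_width_um


def children_per_parent(parent_level, child_level):
    """Number of finer patches covered by one coarser patch.

    Assumption (not from the paper): grids are aligned and non-overlapping,
    so the count is the squared ratio of the patch widths.
    """
    if parent_level < child_level:
        raise ValueError("parent_level must be the coarser (larger) level")
    ratio = 2 ** (parent_level - child_level)
    return ratio * ratio


if __name__ == "__main__":
    for level in range(3):
        magnification, mpp, width = level_geometry(level)
        print(f"Level {level}: {magnification:g}x, {mpp:g} µm/pixel, "
              f"{width:g} µm per patch side")
    print("Level 0 patches under one Level 2 patch:", children_per_parent(2, 0))
```

Under these assumptions a Level 2 patch spans 224 µm per side and a Level 0 patch spans 56 µm, so one coarse patch covers a 4 by 4 block of the finest patches. This is one way to picture the spatial edges between levels.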
