BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology

Amaya Gallagher-Syed, Henry Senior, Omnia Alwazzan, Elena Pontarini, Michele Bombardieri, Costantino Pitzalis, Myles J. Lewis, Michael R. Barnes, Luca Rossi, Gregory Slabaugh

TL;DR

BioX-CPath addresses the need for interpretable multistain IHC analysis by presenting a biologically-grounded graph neural network that fuses semantic and spatial cues across stains. The core innovation, Stain-Aware Attention Pooling (SAAP), generates stain-aware patient embeddings and enables rich interpretability through SAAP scores, entropy measures, stain-stain interaction metrics, and GNN heatmaps, aided by random walk positional encodings for long-range context. On Rheumatoid Arthritis and Sjogren's datasets, BioX-CPath achieves state-of-the-art accuracy and provides mechanistic insights that align with established pathology, demonstrating its potential for clinical deployment. This work strengthens the bridge between high-performance computational pathology and actionable biological understanding, while supplying open-source resources for further development.

Abstract

The development of biologically interpretable and explainable models remains a key challenge in computational pathology, particularly for multistain immunohistochemistry (IHC) analysis. We present BioX-CPath, an explainable graph neural network architecture for whole slide image (WSI) classification that leverages both spatial and semantic features across multiple stains. At its core, BioX-CPath introduces a novel Stain-Aware Attention Pooling (SAAP) module that generates biologically meaningful, stain-aware patient embeddings. Our approach achieves state-of-the-art performance on both Rheumatoid Arthritis and Sjogren's Disease multistain datasets. Beyond performance metrics, BioX-CPath provides interpretable insights through stain attention scores, entropy measures, and stain interaction scores, which make it possible to measure how well the model aligns with known pathological mechanisms. This biological grounding, combined with strong classification performance, makes BioX-CPath particularly suitable for clinical applications where interpretability is key. Source code and documentation can be found at: https://github.com/AmayaGS/BioX-CPath.

Paper Structure

This paper contains 44 sections, 5 equations, 12 figures, 6 tables.

Figures (12)

  • Figure 1: Architecture: Our approach begins by preprocessing the WSIs into patch features using UNI (Chen2024a) (Section A). The resulting features are combined into two graphs, $G_{FS}$ and $G_{RA}$, representing feature space similarity and region adjacency respectively. Because the two graphs share the same node set, we take the union of their edge sets, yielding graph $G_{FRA}$ (Section B). $G_{FRA}$ is then passed through hierarchical GNN blocks (Section C), each consisting of a Graph Attention Network (GAT) (Velickovic2018) and our proposed Stain-Aware Attention Pooling (SAAP) (detailed in the top right), which updates the node features while selecting the most relevant nodes using an importance score. We obtain a stain-aware patient encoding, which we pass through a final MHSA layer before classification. From both the SAAP and GAT layers, we derive metrics that provide biological insights into the model's predictions (Section D).
  • Figure 2: RA Dataset Explainability: The top row shows box plots of the SAAP scores distribution for different stain types (H&E, CD68, CD138, and CD20) for each classification label in the RA dataset (Pauci-Immune and Lymphoid/Myeloid). The bottom row shows the entropy score distributions for each of the stain types according to the classification label.
  • Figure 3: Sjogren Dataset Explainability: The top row shows box plots of the SAAP scores for different stain types (H&E, CD3, CD138, CD20, and CD21) for each classification label in the Sjogren dataset (Sicca and Sjogren). The bottom row shows the entropy score distributions for each of the stain types according to the classification label.
  • Figure 4: Example of low inflammatory vs high inflammatory pathotype presentation in H&E and IHC stains for RA: Rheumatoid Arthritis inflammatory pathotypes based on semi-quantitative analysis of synovial tissue biopsies stained with H&E and with IHC for CD20+ B cells, CD68+ macrophages and CD138+ plasma cells.
  • Figure 5: Example of Sicca vs Sjogren presentation in H&E and IHC stains: Bottom, a patient diagnosed with Sicca; top, a patient diagnosed with Sjogren's Disease. The samples shown are stained with IHC for CD3+ T cells, CD20+ B cells, and CD138+ plasma cells, as well as CD21+ in the case of Sjogren's Disease.
  • ...and 7 more figures
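
The graph construction described in the Figure 1 caption (two graphs over a shared node set, joined by taking the union of their edges) can be illustrated with a small sketch. This is not the authors' implementation. The k-nearest-neighbour rule for $G_{FS}$, the Euclidean distance, the 8-neighbourhood rule for $G_{RA}$, and the function names `knn_edges`, `adjacency_edges`, and `join_graphs` are all assumptions made for illustration; the section above does not specify how either edge set is built.

```python
import numpy as np


def knn_edges(features, k):
    """Assumed rule for G_FS: link each patch to its k nearest patches in feature space."""
    features = np.asarray(features, dtype=float)
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a patch is never its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]
    # Store undirected edges as sorted (i, j) pairs so duplicates collapse.
    return {
        (min(i, int(j)), max(i, int(j)))
        for i in range(len(features))
        for j in nbrs[i]
    }


def adjacency_edges(coords):
    """Assumed rule for G_RA: link patches that touch on the patch grid (8-neighbourhood)."""
    index = {tuple(int(v) for v in c): i for i, c in enumerate(coords)}
    edges = set()
    for (x, y), i in index.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                j = index.get((x + dx, y + dy))
                if j is not None and j > i:
                    edges.add((i, j))
    return edges


def join_graphs(features, coords, k=1):
    """Return the edge sets of G_FS, G_RA and their union G_FRA (shared node set)."""
    e_fs = knn_edges(features, k)
    e_ra = adjacency_edges(coords)
    return e_fs, e_ra, e_fs | e_ra


# Toy example: four patches, two spatial clusters, similar features across clusters.
features = [[0.0, 0.0], [10.0, 10.0], [0.1, 0.0], [10.0, 10.1]]
coords = [(0, 0), (0, 1), (5, 5), (5, 6)]
e_fs, e_ra, e_fra = join_graphs(features, coords, k=1)
print(sorted(e_fra))
```

In this toy case the feature-space edges connect patches that look alike but lie far apart, while the adjacency edges connect physical neighbours, so the joined graph carries both kinds of relationship over the same nodes.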