The Role of Graph-based MIL and Interventional Training in the Generalization of WSI Classifiers

Rita Pereira, M. Rita Verdelho, Catarina Barata, Carlos Santiago

TL;DR

WSI cancer classification is challenged by gigapixel scales and scarce patch-level labels, compounded by domain shifts across centers and scanners. The authors propose GMIL-IT, a Graph-based MIL framework that leverages patch-, region-, or centroid-based graph representations, graph neural networks, and MIL pooling, augmented by backdoor-adjusted interventional training through a confounder dictionary. A thorough set of experiments on Camelyon16/17 shows that graph-based representations alone yield strong generalization under domain shifts, while interventional training may not always improve performance; patch-graphs with GAT-based MIL (PatchGAT-ABMIL) provide the best results. The work provides a practical, robust approach for WSI classification and offers code to enable replication and broader evaluation across cancer types and datasets.

Abstract

Whole Slide Imaging (WSI), which involves high-resolution digital scans of pathology slides, has become the gold standard for cancer diagnosis, but its gigapixel resolution and the scarcity of annotated datasets present challenges for deep learning models. Multiple Instance Learning (MIL), a widely used weakly supervised approach, bypasses the need for patch-level annotations. However, conventional MIL methods overlook the spatial relationships between patches, which are crucial for tasks such as cancer grading and diagnosis. To address this, graph-based approaches have gained prominence by incorporating spatial information through node connections. Despite their potential, both MIL and graph-based models are vulnerable to learning spurious associations, such as color variations in WSIs, which reduces their robustness. In this dissertation, we conduct an extensive comparison of multiple graph construction techniques, MIL models, graph-MIL approaches, and interventional training, introducing a new framework, Graph-based Multiple Instance Learning with Interventional Training (GMIL-IT), for WSI classification. We evaluate their impact on model generalization through domain shift analysis and demonstrate that graph-based models alone achieve the generalization initially anticipated from interventional training. Our code is available here: github.com/ritamartinspereira/GMIL-IT
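
To make the graph-plus-MIL idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a k-nearest-neighbor patch graph, a single mean-aggregation message-passing step standing in for a GNN layer (the paper uses GAT and GCN layers), and a simplified attention scorer in the spirit of ABMIL. The function names, the toy coordinates, and the scorer weights `w` are all hypothetical.

```python
import math

def build_patch_graph(coords, k=2):
    """Connect each patch to its k nearest patches (Euclidean distance)."""
    edges = {}
    for i, (xi, yi) in enumerate(coords):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        )
        edges[i] = [j for _, j in dists[:k]]
    return edges

def mean_neighbor_update(features, edges):
    """One message-passing step: average each node with its neighbors.
    A stand-in for a GNN layer, assumed here for illustration only."""
    updated = []
    for i, feat in enumerate(features):
        group = [feat] + [features[j] for j in edges[i]]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated

def attention_pool(features, w):
    """Attention pooling with a simplified scorer tanh(w . h) per instance."""
    scores = [math.tanh(sum(wi * fi for wi, fi in zip(w, f))) for f in features]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    attn = [e / total for e in exps]
    dim = len(features[0])
    bag = [sum(a * f[d] for a, f in zip(attn, features)) for d in range(dim)]
    return bag, attn

# Toy slide: four patches with 2-D features (hypothetical values).
coords = [(0, 0), (0, 1), (1, 0), (5, 5)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [4.0, 4.0]]

edges = build_patch_graph(coords, k=2)
node_repr = mean_neighbor_update(features, edges)
bag_embedding, attention = attention_pool(node_repr, w=[0.5, -0.25])
```

The bag embedding would then be passed to a slide-level classifier. The point of the sketch is only that spatial neighbors influence each instance representation before pooling, which a conventional MIL aggregator applied to raw patch features does not do.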

Paper Structure

This paper contains 32 sections, 9 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: General Pipeline of GMIL-IT. The process begins with a feature extractor and graph construction module. A GNN model is then used to generate spatially aware instance representations, with a MIL aggregator creating the bag embedding and a classifier determining the label. Bag-level clustering is performed to build a confounder dictionary, which is concatenated to the bag embeddings during the Interventional Training stage.
  • Figure 2: Performance Comparison of Baseline and GAT-based MIL Models Across Camelyon16 (left) and Camelyon17 (right) Datasets.
  • Figure 3: Performance Comparison of Baseline and GCN-based MIL Models Across Camelyon16 (left) and Camelyon17 (right) Datasets.
  • Figure 4: t-SNE visualization of bag embeddings from Camelyon16, comparing ABMIL (left) and GAT-ABMIL (right) for confounder dictionary construction.
  • Figure 5: t-SNE visualization of bag embeddings from Camelyon17, comparing GAT-ABMIL fold 0 (left) and GAT-ABMIL fold 2 (right) for confounder dictionary construction.
  • ...and 1 more figures
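
The confounder-dictionary step described in the Figure 1 caption can be sketched as follows. This is a hedged illustration, not the paper's code: it assumes the dictionary is formed from k-means centroids of bag embeddings and that each bag is concatenated with a similarity-weighted summary of the dictionary entries. The weighting scheme, the helper names `kmeans` and `intervene`, and the toy embeddings are assumptions made for this example.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; returns k centroids used here as the confounder dictionary."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def intervene(bag, dictionary):
    """Concatenate a bag embedding with a softmax-weighted confounder summary.
    The scaled dot-product weighting is an assumption for illustration."""
    scale = math.sqrt(len(bag))
    scores = [sum(b * c for b, c in zip(bag, conf)) / scale for conf in dictionary]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    summary = [
        sum(w * conf[d] for w, conf in zip(weights, dictionary))
        for d in range(len(bag))
    ]
    return bag + summary  # list concatenation: [bag ; confounder summary]

# Toy bag embeddings from a first-stage model (hypothetical values).
bag_embeddings = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]

dictionary = kmeans(bag_embeddings, k=2)
augmented = intervene(bag_embeddings[0], dictionary)
```

In a two-stage setup of this kind, the augmented vector would feed the classifier during the second training stage. The quality of the dictionary depends on how well the first-stage bag embeddings cluster, which is what the t-SNE plots in Figures 4 and 5 visualize.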