Multiple Instance Learning for Glioma Diagnosis using Hematoxylin and Eosin Whole Slide Images: An Indian Cohort Study

Ekansh Chauhan; Amit Sharma; Megha S Uppin; C. V. Jawahar; P. K. Vinod

TL;DR

The paper tackles glioma typing, grading, and IHC biomarker inference from H&E whole-slide images by formulating the task as a multiple instance learning (MIL) pipeline. It introduces the IPD-Brain Indian cohort and systematically evaluates combinations of patch-level feature extractors and MIL aggregators, finding that a ResNet-50 backbone pre-trained with Barlow Twins self-supervised learning (SSL), paired with the Double-Tier Feature Distillation (DTFD) aggregator, yields state-of-the-art performance on the IPD-Brain and TCGA-Brain datasets. The approach achieves high AUCs for multi-class glioma subtype classification, reliable grading performance, and a strong ability to predict IHC biomarkers (IDH, ATRX, TP53) and Ki-67 from H&E, with attention maps that align with pathologists' diagnostic reasoning. Because the model operates on H&E slides alone, it offers a cost-effective complement to molecular testing and may be applicable across diverse patient populations, a point underscored by the newly established IPD-Brain resource. The work supports broader deployment of MIL-based histopathology tools in neuro-oncology and encourages further exploration of extractor-aggregator pairings.

Abstract

The effective management of brain tumors relies on precise typing, subtyping, and grading. This study advances patient care with findings from rigorous multiple instance learning experiments across various feature extractors and aggregators in brain tumor histopathology. It establishes new performance benchmarks in glioma subtype classification across multiple datasets, including a novel dataset focused on the Indian demographic (IPD-Brain), providing a valuable resource for existing research. Using a ResNet-50 pretrained on histopathology datasets for feature extraction, combined with the Double-Tier Feature Distillation (DTFD) feature aggregator, our approach achieves state-of-the-art AUCs of 88.08 on IPD-Brain and 95.81 on TCGA-Brain for three-way glioma subtype classification. Moreover, it establishes new benchmarks in grading and in detecting IHC molecular biomarkers (IDH1R132H, TP53, ATRX, Ki-67) from H&E-stained whole slide images for the IPD-Brain dataset. The work also highlights a significant correlation between the model's decision-making process and the diagnostic reasoning of pathologists, underscoring its capability to mimic professional diagnostic procedures.

Paper Structure (19 sections, 4 equations, 5 figures, 6 tables)

Introduction
Dataset Description
Methodology
MIL Problem formulation
WSI Preprocessing
Feature Extractors (f)
MIL Aggregation Models (g)
CLAM
DSMIL
DTFD
Transfer Learning for IHC Biomarkers
Experimentation
Experiment Setup
Evaluation Metrics
Implementation Details
...and 4 more sections

Figures (5)

Figure 1: Multi-Instance Learning Framework for Subtype Classification in Brain Histopathology. Tissue from surgical samples is digitized and processed into patches, and patch-level features are extracted using a pre-trained feature extractor network. Subsequently, the feature aggregation method pools the patch-level representation into slide-level representation, which is then classified to determine tumor type, grade, and molecular markers. Then, attention mapping is used for interpretability, providing critical insights into the neural network's decision-making process.
Figure 2: Distribution of phenotypic information and labels in IPD-Brain.
Figure 3: IPD-Brain dataset sample at various magnifications, demonstrating the enhanced detail from staining and digitization that is crucial for deep learning analysis.
Figure 4: Comparative Analysis of Feature Extractors and Aggregation Methods. The box plots illustrate the 10-fold AUC and macro F1 scores distribution for all feature extractor and aggregation combinations used. Mean values are indicated by dots within each box, whereas lines represent medians.
Figure 5: Attention heatmap visualization for brain tumor WSI classification, with green and red dots indicating correct classifications and misclassifications, respectively, across all feature aggregators.
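
The pipeline in the Figure 1 caption (patch features, pooled into a slide-level representation, then classified, with attention weights kept for interpretability) can be illustrated with a minimal sketch. This is not the paper's DTFD, CLAM, or DSMIL aggregator: it is a plain single-layer attention pooling chosen for brevity, and every name in it (`attention_pool`, `classify_slide`, `attn_vector`, `class_weights`) and every number in the toy example is a hypothetical stand-in, assuming patch features have already been extracted.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(
    patch_features: Sequence[Sequence[float]],
    attn_vector: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool patch features into one slide embedding.

    Each patch gets a scalar score (a linear projection onto the
    hypothetical attn_vector); softmax turns the scores into weights
    that sum to 1, and the slide embedding is the weighted average.
    The weights are returned so they can be drawn as a heatmap.
    """
    scores = [dot(f, attn_vector) for f in patch_features]
    weights = softmax(scores)
    dim = len(patch_features[0])
    slide = [
        sum(w * f[d] for w, f in zip(weights, patch_features))
        for d in range(dim)
    ]
    return slide, weights


def classify_slide(
    slide_embedding: Sequence[float],
    class_weights: Sequence[Sequence[float]],
) -> List[float]:
    """Linear classifier head: one weight row per class, softmax output."""
    logits = [dot(slide_embedding, w) for w in class_weights]
    return softmax(logits)


if __name__ == "__main__":
    # Toy bag: three patches with 2-dimensional features (made-up values).
    bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    slide, attn = attention_pool(bag, attn_vector=[5.0, 0.0])
    # Three hypothetical classes, standing in for three glioma subtypes.
    probs = classify_slide(slide, [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
    print("attention weights:", attn)
    print("class probabilities:", probs)
```

In a trained model the attention vector and classifier weights are learned jointly from slide-level labels only; the per-patch weights are what an attention heatmap such as the one in Figure 5 visualizes.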
