Advances in Multiple Instance Learning for Whole Slide Image Analysis: Techniques, Challenges, and Future Directions

Jun Wang, Yu Mao, Nan Guan, Chun Jason Xue

TL;DR

This survey analyzes how multiple instance learning (MIL) is leveraged to address whole slide image (WSI) analysis in computational pathology. It covers core MIL architectures, including attention-based pooling (ABMIL), clustering-guided methods (CLAM), dual-stream and channel/spatial attention variants, graph neural networks, transformers, and end-to-end training strategies, with emphasis on scalability to gigapixel WSIs. The review distinguishes automation tasks (cancer detection, grading, and subtyping) from discovery tasks (predicting mutations, gene expression, MSI, and tumor origin), highlighting multimodal and multi-scale embeddings, and the interpretability afforded by attention heatmaps. It discusses current challenges such as data scarcity, rare diseases, and distribution shifts, and outlines future directions toward robust, interpretable, and workflow-efficient MIL frameworks for clinical deployment.

Abstract

Whole slide images (WSIs) are gigapixel-scale digital images of H&E-stained tissue samples widely used in pathology. The substantial size and complexity of WSIs pose unique analytical challenges. Multiple Instance Learning (MIL) has emerged as a powerful approach for addressing these challenges, particularly in cancer classification and detection. This survey provides a comprehensive overview of the challenges and methodologies associated with applying MIL to WSI analysis, including attention mechanisms, pseudo-labeling, transformers, pooling functions, and graph neural networks. Additionally, it explores the potential of MIL in discovering cancer cell morphology, constructing interpretable machine learning models, and quantifying cancer grading. By summarizing the current challenges, methodologies, and potential applications of MIL in WSI analysis, this survey aims to inform researchers about the state of the field and inspire future research directions.

Paper Structure (43 sections, 10 equations, 2 figures, 2 tables)

Figures (2)

  • Figure 1: General workflow and tasks in Cpath. Original WSIs and pre-processing are from [lu2021data].
  • Figure 2: Features in Cpath [hiptlipkova2022artificial].