Cross-scale Multi-instance Learning for Pathological Image Diagnosis

Ruining Deng, Can Cui, Lucas W. Remedios, Shunxing Bao, R. Michael Womick, Sophie Chiron, Jia Li, Joseph T. Roland, Ken S. Lau, Qi Liu, Keith T. Wilson, Yaohong Wang, Lori A. Coburn, Bennett A. Landman, Yuankai Huo

TL;DR

The paper tackles the problem of diagnosing pathology from gigapixel whole-slide images by leveraging information across multiple scales. It introduces CS-MIL, a cross-scale attention framework that learns scale-specific phenotype embeddings, processes them through a shared multi-scale encoder, and fuses them with attention to form a cross-scale representation $F_{cs} = \sum_{s=1}^S a_s f_s$ for MIL-based slide classification. Key contributions include a novel cross-scale MIL algorithm, a toy dataset to visualize cross-scale attention, and state-of-the-art performance on both in-house Crohn's disease data and TCGA-GBMLGG with an emphasis on interpretability through cross-scale attention maps. The approach advances digital pathology by enabling interpretable, scale-aware diagnosis and provides a reproducible pipeline with a publicly available implementation.
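
The fusion step $F_{cs} = \sum_{s=1}^S a_s f_s$ can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the official implementation: the function name `cross_scale_fuse`, the scoring parameters `w` and `v`, the tanh scoring layer (borrowed from common attention-based MIL pooling), and the softmax normalization of the scores are all assumptions made here, since the summary only states that the scale-specific features $f_s$ are weighted by attention scores $a_s$ and summed.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_scale_fuse(features, w, v):
    """Fuse per-scale embeddings into one cross-scale representation.

    features: (S, D) array, one embedding f_s per scale.
    w: (D, H) and v: (H,) are hypothetical scoring parameters
       (assumed here, not taken from the paper).
    Returns (f_cs, a): the fused (D,) vector and the (S,) attention weights.
    """
    scores = np.tanh(features @ w) @ v  # one scalar score per scale, shape (S,)
    a = softmax(scores)                 # assumed normalization: weights sum to 1
    f_cs = a @ features                 # F_cs = sum_s a_s * f_s
    return f_cs, a


# Toy input: S = 3 scales (e.g. 5x, 10x, 20x), D = 8 features per scale.
rng = np.random.default_rng(0)
S, D, H = 3, 8, 4
features = rng.normal(size=(S, D))
w = rng.normal(size=(D, H))
v = rng.normal(size=H)

f_cs, a = cross_scale_fuse(features, w, v)
```

Under these assumptions, the weights `a` can be read as a per-scale importance score for one patch location, which is the quantity the cross-scale attention maps visualize. With all-zero scoring parameters the weights become uniform and `f_cs` reduces to the plain mean of the scale embeddings.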

Abstract

Analyzing high resolution whole slide images (WSIs) with regard to information across multiple scales poses a significant challenge in digital pathology. Multi-instance learning (MIL) is a common solution for working with high resolution images by classifying bags of objects (i.e. sets of smaller image patches). However, such processing is typically performed at a single scale (e.g., 20x magnification) of WSIs, disregarding the vital inter-scale information that is key to diagnoses by human pathologists. In this study, we propose a novel cross-scale MIL algorithm to explicitly aggregate inter-scale relationships into a single MIL network for pathological image diagnosis. The contribution of this paper is three-fold: (1) A novel cross-scale MIL (CS-MIL) algorithm that integrates the multi-scale information and the inter-scale relationships is proposed; (2) A toy dataset with scale-specific morphological features is created and released to examine and visualize differential cross-scale attention; (3) Superior performance on both in-house and public datasets is demonstrated by our simple cross-scale MIL strategy. The official implementation is publicly available at https://github.com/hrlblab/CS-MIL.

Paper Structure (20 sections, 6 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: Multi-scale awareness. Given the heterogeneous structural patterns in tissue samples at different resolutions, human pathologists need to carefully examine biopsies at multiple scales across a whole slide image to capture morphological patterns for disease diagnosis.
  • Figure 2: Multi-scale MIL designs. a. Previous work did not take into account the inter-scale relationships across different resolutions. b. Our solution enables the identification of significant regions using cross-scale attention maps, and aggregates the cross-scale features into a cross-scale representation by multiplying the cross-scale attention scores for diagnosing pathological images. c. The cross-scale attention mechanism is employed to merge the cross-scale features with different attention scores. Cross-scale representations from various clusters are concatenated for pathological classification.
  • Figure 3: ROC curves with AUC scores and PR curves with AP scores. This figure illustrates the receiver operating characteristic (ROC) curves and precision-recall (PR) curves for both baseline models and the proposed model, along with the corresponding area under the curve (AUC) scores and average precision (AP) scores. The results indicate that the proposed model with cross-scale attention outperformed the baseline models in terms of both metrics.
  • Figure 4: Attention Map Visualization. This figure displays the cross-scale attention maps generated by the proposed model for a CD WSI. The attention map at 20$\times$ highlights the chronic inflammatory infiltrates, whereas the attention map at 10$\times$ focuses on the crypt structures. These regions of interest indicate the distinctive areas for CD diagnosis that are discernible across multiple scales.
  • Figure 5: Two toy datasets. This figure presents the two toy datasets used to evaluate the functionality of the cross-scale attention mechanism. In the micro-anomaly dataset, the white cross pattern is only observed at 20$\times$. In the macro-anomaly dataset, the abnormal shape (ellipse) is easily recognized at 5$\times$.
  • ...and 1 more figures