Cross-scale Multi-instance Learning for Pathological Image Diagnosis
Ruining Deng, Can Cui, Lucas W. Remedios, Shunxing Bao, R. Michael Womick, Sophie Chiron, Jia Li, Joseph T. Roland, Ken S. Lau, Qi Liu, Keith T. Wilson, Yaohong Wang, Lori A. Coburn, Bennett A. Landman, Yuankai Huo
TL;DR
The paper addresses diagnosis from gigapixel whole-slide images (WSIs) by leveraging information across multiple scales. It introduces CS-MIL, a cross-scale attention framework for multi-instance learning (MIL) that learns scale-specific phenotype embeddings, processes them through a shared multi-scale encoder, and fuses them with attention into a cross-scale representation $F_{cs} = \sum_{s=1}^{S} a_s f_s$, where $f_s$ is the embedding at scale $s$ and $a_s$ is its attention weight, for slide-level classification. Key contributions include a novel cross-scale MIL algorithm, a toy dataset for visualizing cross-scale attention, and state-of-the-art performance on both in-house Crohn's disease data and TCGA-GBMLGG, with an emphasis on interpretability through cross-scale attention maps. The approach enables interpretable, scale-aware diagnosis, and its implementation is publicly available.
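The fusion step $F_{cs} = \sum_{s=1}^{S} a_s f_s$ can be illustrated with a minimal NumPy sketch. The weighted sum follows the formula above; everything else is an assumption made for illustration and is not taken from the paper or its repository: the function name `cross_scale_fuse`, the tanh-based scoring used to produce the attention weights, the softmax normalization, and the random toy dimensions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / np.sum(exp)


def cross_scale_fuse(scale_embeddings, proj, score_vec):
    """Fuse per-scale embeddings into one cross-scale vector.

    scale_embeddings: (S, D) array, one embedding f_s per scale.
    proj:             (D, H) hypothetical projection matrix.
    score_vec:        (H,) hypothetical scoring vector.

    The scoring function (tanh projection followed by a dot product)
    is an assumed, generic attention form, not the paper's exact one.
    """
    hidden = np.tanh(scale_embeddings @ proj)   # (S, H)
    scores = hidden @ score_vec                 # (S,) one score per scale
    attn = softmax(scores)                      # a_s, non-negative, sums to 1
    fused = attn @ scale_embeddings             # F_cs = sum_s a_s * f_s, shape (D,)
    return fused, attn


# Toy example: S = 3 scales, D = 8 embedding dims, H = 4 hidden dims.
rng = np.random.default_rng(0)
f = rng.normal(size=(3, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=(4,))

F_cs, a = cross_scale_fuse(f, V, w)
```

Under these assumptions, `a` holds one weight per scale and can be read as a rough indicator of how much each scale contributes to the fused vector, which is the kind of quantity a cross-scale attention map would visualize.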
Abstract
Analyzing high-resolution whole slide images (WSIs) while accounting for information across multiple scales poses a significant challenge in digital pathology. Multi-instance learning (MIL) is a common solution for working with high-resolution images by classifying bags of objects (i.e., sets of smaller image patches). However, such processing is typically performed at a single scale (e.g., 20x magnification) of WSIs, disregarding the vital inter-scale information that is key to diagnoses by human pathologists. In this study, we propose a novel cross-scale MIL algorithm to explicitly aggregate inter-scale relationships into a single MIL network for pathological image diagnosis. The contribution of this paper is three-fold: (1) A novel cross-scale MIL (CS-MIL) algorithm that integrates the multi-scale information and the inter-scale relationships is proposed; (2) A toy dataset with scale-specific morphological features is created and released to examine and visualize differential cross-scale attention; (3) Superior performance on both in-house and public datasets is demonstrated by our simple cross-scale MIL strategy. The official implementation is publicly available at https://github.com/hrlblab/CS-MIL.
