Table of Contents
Fetching ...

MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models

Zijie Fang, Yifeng Wang, Ye Zhang, Zhi Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang

TL;DR

This work proposes a framework named MamMIL for WSI analysis by cooperating the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of global instance dependencies while maintaining linear complexity.

Abstract

Recently, pathological diagnosis has achieved superior performance by combining deep learning models with the multiple instance learning (MIL) framework using whole slide images (WSIs). However, the giga-pixeled nature of WSIs poses a great challenge for efficient MIL. Existing studies either do not consider global dependencies among instances, or use approximations such as linear attentions to model the pair-to-pair instance interactions, which inevitably brings performance bottlenecks. To tackle this challenge, we propose a framework named MamMIL for WSI analysis by cooperating the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of global instance dependencies while maintaining linear complexity. Specifically, considering the irregularity of the tissue regions in WSIs, we represent each WSI as an undirected graph. To address the problem that Mamba can only process 1D sequences, we further propose a topology-aware scanning mechanism to serialize the WSI graphs while preserving the topological relationships among the instances. Finally, in order to further perceive the topological structures among the instances and incorporate short-range feature interactions, we propose an instance aggregation block based on graph neural networks. Experiments show that MamMIL can achieve advanced performance than the state-of-the-art frameworks. The code can be accessed at https://github.com/Vison307/MamMIL.

MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models

TL;DR

This work proposes a framework named MamMIL for WSI analysis by cooperating the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of global instance dependencies while maintaining linear complexity.

Abstract

Recently, pathological diagnosis has achieved superior performance by combining deep learning models with the multiple instance learning (MIL) framework using whole slide images (WSIs). However, the giga-pixeled nature of WSIs poses a great challenge for efficient MIL. Existing studies either do not consider global dependencies among instances, or use approximations such as linear attentions to model the pair-to-pair instance interactions, which inevitably brings performance bottlenecks. To tackle this challenge, we propose a framework named MamMIL for WSI analysis by cooperating the selective structured state space model (i.e., Mamba) with MIL, enabling the modeling of global instance dependencies while maintaining linear complexity. Specifically, considering the irregularity of the tissue regions in WSIs, we represent each WSI as an undirected graph. To address the problem that Mamba can only process 1D sequences, we further propose a topology-aware scanning mechanism to serialize the WSI graphs while preserving the topological relationships among the instances. Finally, in order to further perceive the topological structures among the instances and incorporate short-range feature interactions, we propose an instance aggregation block based on graph neural networks. Experiments show that MamMIL can achieve advanced performance than the state-of-the-art frameworks. The code can be accessed at https://github.com/Vison307/MamMIL.
Paper Structure (16 sections, 6 equations, 2 figures, 3 tables)

This paper contains 16 sections, 6 equations, 2 figures, 3 tables.

Figures (2)

  • Figure 1: An overview of the MamMIL framework, which is composed of an instance feature extraction and graph construction stage, along with an instance feature aggregation stage.
  • Figure 2: Illustrations of the four traversal strategies. The index order traverses the MST using the node index. The pre-order traversal first accesses the root node, and then traverses each sub-tree recursively. The post-order traversal first traverses the sub-trees and accesses the root node at last. The level-order traversal accesses each node from shallower layers to deeper layers.