MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology

Shu Yang, Yihui Wang, Hao Chen

TL;DR

This paper addresses limitations of Multiple Instance Learning (MIL) for Whole Slide Images (WSIs): limited interaction among instances, time-consuming computation, and overfitting. It introduces MambaMIL, which brings the Selective Scan Space State Sequential Model (Mamba) into MIL for long-sequence modeling with linear complexity, built around a new Sequence Reordering Mamba (SR-Mamba) module. SR-Mamba processes two views of the instance sequence, one in the original order and one produced by segment-based rearrangement and feature re-embedding, and fuses their representations to obtain more discriminative features. On two tasks across nine public datasets, MambaMIL performs favorably against state-of-the-art MIL methods, with potential for extension to multi-modal computational pathology.

Abstract

Multiple Instance Learning (MIL) has emerged as a dominant paradigm for extracting discriminative feature representations from Whole Slide Images (WSIs) in computational pathology. Despite driving notable progress, existing MIL approaches struggle to facilitate comprehensive and efficient interactions among instances, and they face challenges related to time-consuming computation and overfitting. In this paper, we incorporate the Selective Scan Space State Sequential Model (Mamba) into MIL for long sequence modeling with linear complexity, and term the resulting framework MambaMIL. By inheriting the capability of vanilla Mamba, MambaMIL can comprehensively understand and perceive long sequences of instances. Furthermore, we propose the Sequence Reordering Mamba (SR-Mamba), which is aware of the order and distribution of instances and exploits the valuable information embedded within long sequences. With SR-Mamba as its core component, MambaMIL can effectively capture more discriminative features and mitigate the challenges associated with overfitting and high computational overhead. Extensive experiments on two challenging public tasks across nine diverse datasets demonstrate that our proposed framework performs favorably against state-of-the-art MIL methods. The code is released at https://github.com/isyangshu/MambaMIL.
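
The two-view idea behind SR-Mamba can be illustrated with a small NumPy sketch. It is not the released implementation: the function names, the `rate` parameter (taken here to be the segment length), zero padding, fusion by addition, and the running-mean `mixer` that stands in for a Mamba layer are all assumptions made for illustration.

```python
import numpy as np


def reorder(x: np.ndarray, rate: int):
    """Cut an (n, d) sequence into segments of length `rate`, then read the
    segments column-wise: first item of every segment, then second, and so on.
    Returns the reordered sequence and the original length n."""
    n, d = x.shape
    pad = (-n) % rate  # zero-pad so that n becomes a multiple of rate (assumption)
    if pad:
        x = np.concatenate([x, np.zeros((pad, d), dtype=x.dtype)], axis=0)
    segments = x.reshape(-1, rate, d)  # (n_segments, rate, d)
    return segments.transpose(1, 0, 2).reshape(-1, d), n


def restore(y: np.ndarray, rate: int, n: int) -> np.ndarray:
    """Invert `reorder` and drop the padding rows."""
    d = y.shape[1]
    segments = y.reshape(rate, -1, d).transpose(1, 0, 2)
    return segments.reshape(-1, d)[:n]


def running_mean(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a causal sequence model (NOT Mamba)."""
    steps = np.arange(1, x.shape[0] + 1)[:, None]
    return np.cumsum(x, axis=0) / steps


def two_view(x: np.ndarray, rate: int, mixer=running_mean) -> np.ndarray:
    """Scan the original and the reordered view, then fuse by addition (assumed)."""
    original_view = mixer(x)
    reordered, n = reorder(x, rate)
    reordered_view = restore(mixer(reordered), rate, n)
    return original_view + reordered_view


if __name__ == "__main__":
    x = np.arange(6, dtype=float).reshape(6, 1)
    y, n = reorder(x, rate=2)
    print(y[:, 0])  # [0. 2. 4. 1. 3. 5.]
    print(two_view(x, rate=2).shape)  # (6, 1)
```

Because a causal scan only sees earlier positions, the reordered view places instances that were far apart in the original order next to each other, which is the intuition for combining the two views.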

Paper Structure

This paper contains 11 sections, 9 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview of MambaMIL. Given a set of patches cropped from a slide, we sequentially utilize Feature Extractor, Linear Projection, stacked SR-Mamba modules and Aggregation for WSI analysis.
  • Figure 2: Illustration of Sequence Reordering Operation.
  • Figure 3: The performance comparison between TransMIL and our MambaMIL on the BRACS validation set throughout the training process.
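
The data flow named in the Figure 1 caption (Feature Extractor, Linear Projection, stacked SR-Mamba modules, Aggregation) might be sketched as below. This only shows how tensor shapes move through the stages. The feature dimension of 1024, the hidden size, the number of blocks, mean pooling as the aggregator, the final linear classifier, and `placeholder_block` (a residual running mean, not Mamba) are all assumptions; patch features are taken as already extracted.

```python
import numpy as np

rng = np.random.default_rng(0)


def linear(x: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Plain affine map; works for (n, d_in) and (d_in,) inputs."""
    return x @ w + b


def placeholder_block(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for one SR-Mamba module: residual + causal running mean."""
    steps = np.arange(1, x.shape[0] + 1)[:, None]
    return x + np.cumsum(x, axis=0) / steps


def forward(patch_features: np.ndarray, params: dict, n_blocks: int = 2) -> np.ndarray:
    h = linear(patch_features, params["w_proj"], params["b_proj"])  # linear projection
    for _ in range(n_blocks):                                      # stacked sequence modules
        h = placeholder_block(h)
    slide_embedding = h.mean(axis=0)                               # aggregation (mean pooling assumed)
    return linear(slide_embedding, params["w_cls"], params["b_cls"])  # slide-level logits


d_in, d_hidden, n_classes = 1024, 128, 2  # illustrative sizes, not from the paper
params = {
    "w_proj": rng.normal(scale=0.02, size=(d_in, d_hidden)),
    "b_proj": np.zeros(d_hidden),
    "w_cls": rng.normal(scale=0.02, size=(d_hidden, n_classes)),
    "b_cls": np.zeros(n_classes),
}

features = rng.normal(size=(1000, d_in))  # one slide = a bag of 1000 patch features
logits = forward(features, params)
print(logits.shape)  # (2,)
```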