Mamba2MIL: State Space Duality Based Multiple Instance Learning for Computational Pathology
Yuqi Zhang, Xiaoqian Zhang, Jiakai Wang, Yuancheng Yang, Taiying Peng, Chao Tong
TL;DR
The paper tackles two limitations of current MIL approaches in computational pathology, incomplete information utilization and insufficient fusion of diverse patch features, by introducing Mamba2MIL. Mamba2MIL uses a state space duality (SSD) model to handle long sequences of WSI patches and applies a sequence transformation (sequence squaring and reordering) to adapt to WSIs of varying sizes while preserving local sequence information. It processes three sequence orders (original, flipped, and transposed) through stacked SSD blocks and applies a hyperbolic tangent-based feature weighting to fuse them into a single bag representation for classification. On the BRACS and NSCLC datasets, Mamba2MIL delivers state-of-the-art or near-state-of-the-art AUC and accuracy.
Abstract
Computational pathology (CPath) has significantly advanced the clinical practice of pathology. Despite this progress, Multiple Instance Learning (MIL), a promising paradigm within CPath, continues to face challenges, particularly incomplete information utilization. Existing frameworks, such as those based on Convolutional Neural Networks (CNNs), attention, and selective scan state space models (SSMs), lack the flexibility and scalability needed to fuse diverse features effectively. Additionally, current approaches do not adequately exploit order-related and order-independent features, resulting in suboptimal utilization of sequence information. To address these limitations, we propose a novel MIL framework called Mamba2MIL. Our framework uses the state space duality (SSD) model to process long sequences of patches from whole slide images (WSIs). Combined with weighted feature selection, it supports the fusion of features from more branches and can be extended according to specific application needs. Moreover, we introduce a sequence transformation method tailored to varying WSI sizes, which enhances sequence-independent features while preserving local sequence information, thereby improving sequence information utilization. Extensive experiments across multiple datasets demonstrate that Mamba2MIL surpasses state-of-the-art MIL methods, with improvements in nearly all performance metrics. Specifically, on the NSCLC dataset, Mamba2MIL achieves a binary tumor classification AUC of 0.9533 and an accuracy of 0.8794. On the BRACS dataset, it achieves a multiclass classification AUC of 0.7986 and an accuracy of 0.4981. The code is available at https://github.com/YuqiZhang-Buaa/Mamba2MIL.
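
To make the sequence transformation more concrete, the sketch below shows one plausible reading of "sequence squaring and reordering": the patch sequence is padded to the next perfect-square length, viewed as a square grid, and read out in three orders (original, flipped, transposed). This is an illustrative assumption, not the authors' implementation; the function name `square_and_reorder`, the `pad` argument, the padding-at-the-end scheme, and the exact meaning given to "flipped" (full reversal) and "transposed" (column-major readout) are all hypothetical. The released code is the reference for the actual behavior.

```python
import math
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def square_and_reorder(
    patches: Sequence[T], pad: T
) -> Tuple[List[T], List[T], List[T]]:
    """Return (original, flipped, transposed) orderings of a padded bag.

    Assumption: the bag is padded at the end with `pad` until its length
    is a perfect square, so it can be viewed as a side x side grid.
    """
    n = len(patches)
    if n == 0:
        raise ValueError("bag must contain at least one patch")

    # Smallest side such that side * side >= n (exact integer arithmetic).
    side = math.isqrt(n - 1) + 1
    padded = list(patches) + [pad] * (side * side - n)

    # Row-major grid view of the padded sequence.
    grid = [padded[r * side:(r + 1) * side] for r in range(side)]

    original = padded                      # row-major readout
    flipped = padded[::-1]                 # reversed readout (assumed meaning)
    transposed = [grid[r][c]               # column-major readout
                  for c in range(side)
                  for r in range(side)]
    return original, flipped, transposed


# Hypothetical usage: 7 patch indices stand in for 7 patch feature vectors.
orig, flip, trans = square_and_reorder(list(range(7)), pad=-1)
```

Under this reading, all three sequences have the same length, so each could be fed to its own branch of stacked SSD blocks before the branch outputs are weighted and fused. How the weighting is computed is not specified in the text above, so it is left out of the sketch.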
