Mamba-Sea: A Mamba-based Framework with Global-to-Local Sequence Augmentation for Generalizable Medical Image Segmentation
Zihan Cheng, Jintao Guo, Jian Zhang, Lei Qi, Luping Zhou, Yinghuan Shi, Yang Gao
TL;DR
This work tackles the challenge of domain generalization in medical image segmentation where distribution shifts across sites degrade performance. It introduces Mamba-Sea, a Mamba-based framework that couples global appearance augmentation (GVA) with local sequence-wise style transformation (LSA) to diversify both global and sequence-level token dependencies, supplemented by semantic consistency training to enforce domain-invariant predictions. Empirical results on Fundus and Prostate DG benchmarks, plus a large-scale skin lesion dataset, show state-of-the-art Dice scores and favorable computational efficiency, with notable improvements over CNN- and ViT-based DG methods as well as SAM-based baselines. The approach demonstrates robust generalization under domain shifts and offers a scalable, modular pathway to leverage Mamba for DG in medical image segmentation, with clear avenues for future enhancements such as feature disentanglement and foundation-model integration.
Abstract
To segment medical images under distribution shifts, domain generalization (DG) has emerged as a promising setting in which models trained on source domains must generalize to unseen target domains. Existing DG methods are mainly based on CNN or ViT architectures. Recently, advanced state space models, represented by Mamba, have shown promising results in various supervised medical image segmentation tasks. Mamba's success is primarily due to its ability to capture long-range dependencies while keeping complexity linear in the input sequence length, making it a promising alternative to CNNs and ViTs. Inspired by this success, in this paper we explore the potential of the Mamba architecture to address distribution shifts in DG for medical image segmentation. Specifically, we propose a novel Mamba-based framework, Mamba-Sea, which incorporates global-to-local sequence augmentation to improve the model's generalizability under domain shift. Mamba-Sea introduces a global augmentation mechanism designed to simulate potential variations in appearance across different sites, aiming to suppress the model's learning of domain-specific information. At the local level, we propose a sequence-wise augmentation along input sequences, which perturbs the style of tokens within random continuous sub-sequences by modeling and resampling style statistics associated with domain shifts. To the best of our knowledge, Mamba-Sea is the first work to explore the generalization of Mamba for medical image segmentation, providing an advanced and promising Mamba-based architecture with strong robustness to domain shifts. Remarkably, our proposed method is the first to surpass a Dice coefficient of 90% on the Prostate dataset, exceeding the previous SOTA of 88.61%. The code is available at https://github.com/orange-czh/Mamba-Sea.
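The local, sequence-wise augmentation described above can be illustrated with a minimal sketch. This is not the released implementation: the function name `perturb_subsequence_style`, the parameters `min_len` and `noise_scale`, and the Gaussian jitter applied to the statistics are all assumptions made for illustration. The sketch treats the per-channel mean and standard deviation of a random contiguous sub-sequence as its "style", resamples those statistics, and re-normalizes only the tokens inside that sub-sequence.

```python
import random
import statistics


def perturb_subsequence_style(tokens, min_len=2, noise_scale=0.1, rng=None, eps=1e-6):
    """Re-style one random contiguous sub-sequence of a token sequence.

    tokens: list of L tokens, each a list of C floats.
    Returns (new_tokens, (start, end)); the input list is not modified.
    Hypothetical helper: the noise model below is an assumption.
    """
    rng = rng or random.Random()
    seq_len, channels = len(tokens), len(tokens[0])

    # Pick a random contiguous window [start, end) along the sequence.
    length = rng.randint(min(min_len, seq_len), seq_len)
    start = rng.randint(0, seq_len - length)
    end = start + length

    out = [list(t) for t in tokens]
    for c in range(channels):
        segment = [tokens[i][c] for i in range(start, end)]
        # "Style" statistics of the window for this channel.
        mu = statistics.fmean(segment)
        sigma = statistics.pstdev(segment) + eps
        # Resample the statistics (assumed: Gaussian jitter around observed values).
        new_mu = mu + noise_scale * sigma * rng.gauss(0.0, 1.0)
        new_sigma = max(eps, sigma * (1.0 + noise_scale * rng.gauss(0.0, 1.0)))
        # Normalize with the old statistics, then re-scale with the resampled ones.
        for i in range(start, end):
            out[i][c] = (tokens[i][c] - mu) / sigma * new_sigma + new_mu
    return out, (start, end)


if __name__ == "__main__":
    demo = [[float(i + c) for c in range(3)] for i in range(8)]
    augmented, window = perturb_subsequence_style(demo, rng=random.Random(0))
    print("perturbed window:", window)
```

Tokens outside the sampled window are returned unchanged, so the perturbation stays local to one sub-sequence. In practice this would operate on batched feature tensors inside the network rather than on Python lists.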
