Customized Segment Anything Model for Medical Image Segmentation
Kaidong Zhang, Dong Liu
TL;DR
Addresses medical image segmentation by repurposing a large-scale segmentation system (SAM) for semantic medical segmentation. Proposes SAMed, which uses LoRA to fine-tune the image encoder and trains the prompt encoder and mask decoder to output tissue-class masks without requiring prompts at inference. Demonstrates competitive Synapse results (DSC 81.88, HD 20.64) and shows training strategies (warmup, AdamW) that stabilize fine-tuning while keeping deployment/storage overhead low. Concludes that SAMed offers a practical, SAM-compatible path for domain-specific segmentation with extensive ablations supporting design choices.
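As an illustration of the warmup strategy mentioned above, the following is a minimal sketch of a learning-rate schedule with linear warmup followed by polynomial decay. The function name `warmup_lr`, the decay form, and all default values (`base_lr`, `warmup_steps`, `total_steps`, `power`) are assumptions chosen for illustration; they are not taken from the text and may differ from the settings used for SAMed.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=250, total_steps=10000, power=0.9):
    """Illustrative schedule: linear warmup, then polynomial decay.

    All default values are hypothetical placeholders, not reported settings.
    """
    if step < warmup_steps:
        # Ramp up linearly so that the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    # After warmup, decay polynomially towards zero at total_steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * (1.0 - min(progress, 1.0)) ** power


if __name__ == "__main__":
    for s in (0, 100, 249, 250, 5000, 10000):
        print(s, warmup_lr(s))
```

The small initial learning rate keeps early updates from disturbing the pretrained weights before the newly trained parts have settled, which is the usual motivation for warmup when fine-tuning a large model.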
Abstract
We propose SAMed, a general solution for medical image segmentation. Unlike previous methods, SAMed is built upon a large-scale image segmentation model, the Segment Anything Model (SAM), to explore the new research paradigm of customizing large-scale models for medical image segmentation. SAMed applies a low-rank adaptation (LoRA) finetuning strategy to the SAM image encoder and finetunes it together with the prompt encoder and the mask decoder on labeled medical image segmentation datasets. We also observe that the warmup finetuning strategy and the AdamW optimizer lead SAMed to successful convergence and lower loss. Unlike SAM, SAMed can perform semantic segmentation on medical images. Our trained SAMed model achieves 81.88 DSC and 20.64 HD on the Synapse multi-organ segmentation dataset, which is on par with state-of-the-art methods. We conduct extensive experiments to validate the effectiveness of our design. Because SAMed updates only a small fraction of the SAM parameters, its deployment and storage costs are marginal in practical usage. The code of SAMed is available at https://github.com/hitachinsk/SAMed.
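To make the low-rank idea concrete, the sketch below shows a generic LoRA-style linear layer in plain Python: the pretrained weight W stays frozen, and only two small matrices A and B are trained, so the effective weight is W + (alpha / rank) * B A. This is a toy illustration of the general technique, not the SAMed implementation. The names `LoRALinear`, `matvec`, and `lora_trainable_fraction`, the initialization scale, and the example sizes (768, rank 4) are all assumptions made for illustration.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A (toy version)."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained weight, shape d_out x d_in
        # A: small random init, shape rank x d_in (trainable).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B: zero init, shape d_out x rank (trainable), so the update starts at zero.
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank path
        return [b + self.scale * d for b, d in zip(base, delta)]


def lora_trainable_fraction(d_in, d_out, rank):
    """Ratio of LoRA parameters (A and B) to the parameters of the full weight."""
    return rank * (d_in + d_out) / (d_in * d_out)


if __name__ == "__main__":
    layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
    print(layer.forward([1.0, 1.0]))             # equals the frozen output at init
    print(lora_trainable_fraction(768, 768, 4))  # hypothetical layer size and rank
```

Because B starts at zero, the adapted layer initially reproduces the pretrained layer exactly, and only the small matrices A and B need to be stored per task, which is consistent with the low storage overhead described above.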
