MedSAGa: Few-shot Memory Efficient Medical Image Segmentation using Gradient Low-Rank Projection in SAM
Navyansh Mahla, Annie D'souza, Shubh Gupta, Bhavik Kanekar, Kshitij Sharad Jadhav
TL;DR
MedSAGa tackles memory and data constraints in medical image segmentation by integrating Gradient Low-Rank Projection (GaLore) with the Segment Anything Model (SAM) to enable memory-efficient, few-shot fine-tuning of the image encoder. The prompt encoder and mask decoder are fine-tuned conventionally, which keeps the training footprint light, and the model produces multiple masks that are fused into a segmentation map for $k$ classes. Across four diverse datasets, MedSAGa is on average about 66% more memory-efficient than SOTA baselines such as SAMed and DAE-Former, while its segmentation performance remains competitive with theirs in low-data regimes. This work demonstrates the practicality of GaLore-enabled ViT fine-tuning for medical image segmentation, enabling deployment on memory-constrained hardware with accuracy comparable to SOTA.
Abstract
The application of large-scale models in medical image segmentation demands substantial quantities of meticulously annotated data curated by experts, along with high computational resources, both of which are challenges in resource-poor settings. In this study, we present the Medical Segment Anything Model with GaLore (MedSAGa), in which we adopt the Segment Anything Model (SAM) to achieve memory-efficient, few-shot medical image segmentation by applying Gradient Low-Rank Projection (GaLore) to the parameters of SAM's image encoder. Meanwhile, the weights of the prompt encoder and mask decoder undergo full-parameter fine-tuning using standard optimizers. We further assess MedSAGa's few-shot learning capabilities, reporting its memory efficiency and segmentation performance across multiple standard medical image segmentation datasets. We compare it with several baseline models, including LoRA fine-tuned SAM (SAMed) and DAE-Former. Experiments across multiple datasets and these baseline models, with different numbers of images used for fine-tuning, demonstrate that the GPU memory consumption of MedSAGa is significantly lower than that of the baseline models: on average, MedSAGa is 66% more memory-efficient than current state-of-the-art (SOTA) models for medical image segmentation. The combination of substantially lower memory requirements and results comparable to SOTA in few-shot learning for medical image segmentation positions MedSAGa as an optimal solution for deployment in resource-constrained settings.
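To make the memory argument concrete, the following is a minimal NumPy sketch of gradient low-rank projection for a single weight matrix. It is not the authors' implementation. The names `low_rank_projector`, `ProjectedAdam`, and `update_gap`, the rank, and the toy quadratic loss are all assumptions chosen for illustration. The sketch always projects from the left and omits GaLore's update scale factor. The point it illustrates is that the Adam moment estimates are stored in the projected r x n space rather than the full m x n space.

```python
import numpy as np


def low_rank_projector(grad, rank):
    """Return the top-`rank` left singular vectors of `grad` (shape m x rank)."""
    u, _, _ = np.linalg.svd(grad, full_matrices=False)
    return u[:, :rank]


class ProjectedAdam:
    """Adam whose moment estimates live in a rank-r subspace (illustrative only)."""

    def __init__(self, shape, rank, lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                 update_gap=200):
        _, n = shape
        self.rank, self.lr, self.betas, self.eps = rank, lr, betas, eps
        self.update_gap = update_gap      # hypothetical: steps between projector refreshes
        self.proj = None                  # m x r projector, set on the first step
        self.m1 = np.zeros((rank, n))     # first moment: r x n instead of m x n
        self.m2 = np.zeros((rank, n))     # second moment: r x n instead of m x n
        self.step_count = 0

    def step(self, weight, grad):
        # Periodically recompute the projector from the current gradient.
        if self.step_count % self.update_gap == 0:
            self.proj = low_rank_projector(grad, self.rank)
        self.step_count += 1
        b1, b2 = self.betas
        low = self.proj.T @ grad          # project the gradient down to r x n
        self.m1 = b1 * self.m1 + (1 - b1) * low
        self.m2 = b2 * self.m2 + (1 - b2) * low ** 2
        m1_hat = self.m1 / (1 - b1 ** self.step_count)   # bias correction
        m2_hat = self.m2 / (1 - b2 ** self.step_count)
        # Normalize in the low-rank space, then project back to m x n.
        update = self.proj @ (m1_hat / (np.sqrt(m2_hat) + self.eps))
        return weight - self.lr * update


# Toy problem (assumption): pull a 64 x 32 weight matrix toward a fixed target.
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 32))
target = rng.standard_normal((64, 32))
opt = ProjectedAdam(weight.shape, rank=4, lr=1e-2)

loss_before = 0.5 * np.sum((weight - target) ** 2)
for _ in range(50):
    grad = weight - target                # gradient of the quadratic loss
    weight = opt.step(weight, grad)
loss_after = 0.5 * np.sum((weight - target) ** 2)

adam_state = 2 * weight.size                                  # two full m x n moments
galore_state = opt.m1.size + opt.m2.size + opt.proj.size      # 2*r*n + m*r
print(adam_state, galore_state)  # 4096 512
```

In this toy setting the optimizer state shrinks from 2mn entries to 2rn + mr entries. The savings reported in the abstract concern GPU memory during fine-tuning of SAM's image encoder and depend on the ranks and layer shapes used there, which this sketch does not attempt to reproduce.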
