SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation
Javier Gamazo Tejero, Moritz Schmid, Pablo Márquez Neila, Martin S. Zinkernagel, Sebastian Wolf, Raphael Sznitman
TL;DR
Medical semantic segmentation suffers from domain shifts across devices and sites. This work introduces SAM-DA, a decoder-focused adapter that injects learnable prompts at each decoder layer to modulate embeddings. Only a small fraction of parameters is trainable, so the base SAM is preserved while domain adaptation remains possible. Across four datasets and two task settings (fully supervised and test-time domain adaptation), SAM-DA achieves competitive or superior performance compared with full fine-tuning and encoder-focused PEFT baselines, while training fewer than 1% of SAM's parameters. Ablation studies examine adapter placement in the decoder, adapter size, and the interaction with large-scale natural-image pretraining, highlighting the practical value of a lightweight, generalizable adaptation strategy for medical imaging.
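To make the idea of per-layer learnable prompts on top of frozen weights concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation. The names `FrozenLayer`, `PromptedDecoder`, and `build_decoder` are hypothetical, identity matrices stand in for pretrained decoder weights, and additive modulation of the embedding is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrozenLayer:
    """Stand-in for a pretrained decoder layer; its weights are never updated."""
    weights: List[List[float]]  # square matrix, dim x dim

    def num_params(self) -> int:
        return sum(len(row) for row in self.weights)

    def forward(self, x: List[float]) -> List[float]:
        # Plain matrix-vector product.
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]


@dataclass
class PromptedDecoder:
    """Frozen layers plus one learnable prompt vector per layer (hypothetical)."""
    layers: List[FrozenLayer]
    prompts: List[List[float]]  # the only trainable parameters in this toy

    def forward(self, x: List[float]) -> List[float]:
        for layer, prompt in zip(self.layers, self.prompts):
            # Assumption: the prompt modulates the embedding additively
            # before it enters the frozen layer.
            x = layer.forward([xi + pi for xi, pi in zip(x, prompt)])
        return x

    def trainable_fraction(self) -> float:
        trainable = sum(len(p) for p in self.prompts)
        frozen = sum(layer.num_params() for layer in self.layers)
        return trainable / (trainable + frozen)


def build_decoder(dim: int, depth: int) -> PromptedDecoder:
    """Build a toy decoder with identity weights and zero-initialised prompts."""
    identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    layers = [FrozenLayer([row[:] for row in identity]) for _ in range(depth)]
    prompts = [[0.0] * dim for _ in range(depth)]
    return PromptedDecoder(layers, prompts)
```

With zero-initialised prompts the toy decoder reproduces the frozen model's output exactly, and only the prompt vectors would change during adaptation. In this toy the trainable fraction is 1/(dim + 1), so it falls below 1% once the embedding dimension exceeds 99. This is a property of the sketch only and says nothing about SAM's actual parameter counts.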
Abstract
This paper addresses the domain adaptation challenge for semantic segmentation in medical imaging. Despite their impressive performance on natural images, recent foundation segmentation models such as SAM struggle with medical-domain images. Moreover, recent approaches that fine-tune such models end-to-end are not computationally tractable. To address this, we propose a novel SAM adapter approach that minimizes the number of trainable parameters while achieving performance comparable to full fine-tuning. The proposed SAM adapter is strategically placed in the mask decoder, offering excellent and broad generalization capabilities and improved segmentation across both fully supervised and test-time domain adaptation tasks. Extensive validation on four datasets showcases the adapter's efficacy: it outperforms existing methods while training less than 1% of SAM's total parameters.
