SAM-DA: Decoder Adapter for Efficient Medical Domain Adaptation

Javier Gamazo Tejero; Moritz Schmid; Pablo Márquez Neila; Martin S. Zinkernagel; Sebastian Wolf; Raphael Sznitman

TL;DR

Medical semantic segmentation suffers from domain shifts across devices and sites. This work introduces SAM-DA, a decoder-focused adapter that injects learnable prompts at each decoder layer to modulate embeddings with only a small fraction of trainable parameters, preserving the base SAM while enabling domain adaptation. Across four datasets and two task settings (fully supervised and test-time domain adaptation), SAM-DA achieves competitive or superior performance compared with full fine-tuning and encoder-focused PEFT baselines, while training under $1\%$ of SAM's parameters. Ablation studies elucidate the decoder placement, adapter size, and interaction with large natural-image pretraining, highlighting the practical value of a lightweight, generalizable adaptation strategy for medical imaging.

Abstract

This paper addresses the domain adaptation challenge for semantic segmentation in medical imaging. Although recent foundation segmentation models such as SAM perform impressively on natural images, they struggle with medical images. Moreover, recent approaches that fine-tune these models end-to-end are not computationally tractable. To address this, we propose a novel SAM adapter that minimizes the number of trainable parameters while achieving performance comparable to full fine-tuning. The proposed adapter is placed in the mask decoder, which yields broad generalization and improved segmentation in both fully supervised and test-time domain adaptation tasks. Extensive validation on four datasets demonstrates the adapter's efficacy: it outperforms existing methods while training less than 1% of SAM's total parameters.

Paper Structure (21 sections, 3 equations, 6 figures, 11 tables)

Introduction
Related work
Parameter-Efficient Fine-Tuning
Domain Adaptation for Semantic Segmentation
Method
Segment Anything Model
SAM Decoder Adapter
Experimental setup
Datasets
Baselines
Fully Supervised Training Experiments
Test-Time Domain Adaptation Experiments
Results
Full Supervision
Domain Generalization
...and 6 more sections

Figures (6)

Figure 1: Predictions of the proposed method on three of the four studied datasets: Retouch [retouch], MRI [mri_dataset], and HQSeg-44k [ke2024segment]. For the medical datasets, we show the training domain on the top row and a different domain on the bottom row (Spectralis and Cirrus for Retouch, BMC and UCL for MRI). For HQSeg-44k, both images come from HRSOD [hrsod_zeng2019towards].
Figure 2: Illustration of the proposed adaptation of the SAM decoder in layer $\ell$. In each layer, the adaptation embeddings $A_\ell$ are fed along with the dense embeddings $T_\ell$ to the trainable zero-initialized attention module, where the dense embeddings $T_\ell$ act as queries and the adaptation embeddings $A_\ell$ act as keys and values. The resulting tokens $S_\ell$ are then projected back to the model dimension with a linear layer (omitted in the figure) and finally combined with the decoder embeddings via a trainable gating parameter $g_\ell$ and a linear MLP, resulting in $T'_\ell$, which replaces the previous dense embeddings. A detailed neural circuit diagram [abbott2024neural] can be found in the supplementary material.
Figure 3: Qualitative results on eight randomly selected in-domain test samples.
Figure 4: Qualitative and quantitative results on Retouch and MRI datasets for the proposed model and LoRA. For reference, each image includes its IoU score after five TTDA iterations.
Figure 5: Neural circuit diagram for the proposed SAM-Decoder-Adapter.
...and 1 more figure
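
The data flow described in the Figure 2 caption can be sketched in a few lines of NumPy. This is a minimal, single-head illustration and not the authors' implementation: the class name `DecoderAdapterLayer`, all weight names, the shapes, the random initialization, and the residual form $T'_\ell = T_\ell + g_\ell \cdot \mathrm{MLP}(S_\ell)$ are assumptions made here for illustration. The caption only states that $T_\ell$ provides the queries, $A_\ell$ provides the keys and values, the result is projected back to the model dimension, and a gate $g_\ell$ plus a linear MLP combine it with the decoder embeddings. The sketch also assumes that "zero-initialized" means the gate starts at zero, so the frozen decoder's output is unchanged before training.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class DecoderAdapterLayer:
    """Hypothetical adapter for one decoder layer (all names are assumptions)."""

    def __init__(self, d_model, d_attn, n_adapt, rng):
        def init(*shape):
            # Scaled Gaussian init; the paper's actual init is not given here.
            return rng.normal(0.0, 1.0 / np.sqrt(shape[0]), shape)

        self.A = init(n_adapt, d_attn)      # adaptation embeddings A_l
        self.W_q = init(d_model, d_attn)    # query projection for T_l
        self.W_k = init(d_attn, d_attn)     # key projection for A_l
        self.W_v = init(d_attn, d_attn)     # value projection for A_l
        self.W_out = init(d_attn, d_model)  # projection back to model dimension
        self.W_mlp = init(d_model, d_model) # single linear map ("linear MLP")
        self.g = 0.0                        # gate g_l, assumed zero-initialized

    def __call__(self, T):
        """T: dense embeddings T_l of shape (n_tokens, d_model). Returns T'_l."""
        q = T @ self.W_q                    # queries come from T_l
        k = self.A @ self.W_k               # keys come from A_l
        v = self.A @ self.W_v               # values come from A_l
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        S = (attn @ v) @ self.W_out         # tokens S_l in model dimension
        # Assumed combination: gated residual update of the dense embeddings.
        return T + self.g * (S @ self.W_mlp)
```

With the gate at zero the layer is an identity map on $T_\ell$, which is one common motivation for zero-initialized adapters: training starts from the behavior of the unmodified base model, and only `A`, the projections, and `g` would need gradients.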
