ProMISe: Promptable Medical Image Segmentation using SAM

Jinfeng Wang; Sifan Song; Xinkun Wang; Yiyi Wang; Yiyi Miao; Jionglong Su; S. Kevin Zhou

TL;DR

An Auto-Prompting Module (APM) that provides SAM-based foundation models with Euclidean adaptive prompts in the target domain, significantly improving SAM's non-fine-tuned performance in medical image segmentation (MIS), and a novel non-invasive method called Incremental Pattern Shifting (IPS) that adapts SAM to specific medical domains.

Abstract

Since the Segment Anything Model (SAM) was proposed, fine-tuning SAM for medical image segmentation (MIS) has become popular. However, due to the large size of the SAM model and the significant domain gap between natural and medical images, fine-tuning-based strategies are costly and carry a risk of instability, feature damage, and catastrophic forgetting. Furthermore, some methods that transfer SAM to domain-specific MIS through fine-tuning disable the model's prompting capability, severely limiting its usage scenarios. In this paper, we propose an Auto-Prompting Module (APM), which provides SAM-based foundation models with Euclidean adaptive prompts in the target domain. Our experiments demonstrate that such adaptive prompts significantly improve SAM's non-fine-tuned performance in MIS. In addition, we propose a novel non-invasive method called Incremental Pattern Shifting (IPS) to adapt SAM to specific medical domains. Experimental results show that IPS enables SAM to achieve state-of-the-art or competitive performance in MIS without the need for fine-tuning. By coupling these two methods, we propose ProMISe, an end-to-end non-fine-tuned framework for Promptable Medical Image Segmentation. Our experiments demonstrate that using our methods either individually or in combination achieves satisfactory performance in low-cost pattern shifting, with all of SAM's parameters frozen.
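The Figure 1 caption describes IPS as tokens that are added to the mask tokens while SAM itself stays frozen. The following is a minimal sketch of that additive idea only, written in plain Python with toy vectors. The function name `shift_mask_tokens`, the token shapes, and the values are illustrative assumptions and are not taken from the authors' implementation.

```python
# Hypothetical sketch of additive token shifting; not the authors' code.
from typing import List

Vector = List[float]


def shift_mask_tokens(mask_tokens: List[Vector],
                      ips_tokens: List[Vector]) -> List[Vector]:
    """Return new tokens equal to mask_tokens + ips_tokens, element-wise.

    The frozen mask tokens are not modified in place; in a training setup
    only the separate ips_tokens would receive updates (assumption).
    """
    if len(mask_tokens) != len(ips_tokens):
        raise ValueError("need one IPS token per mask token")
    shifted = []
    for frozen, delta in zip(mask_tokens, ips_tokens):
        if len(frozen) != len(delta):
            raise ValueError("token dimensions must match")
        shifted.append([f + d for f, d in zip(frozen, delta)])
    return shifted


# Toy example: two mask tokens of dimension 3 (sizes chosen for illustration).
frozen_mask_tokens = [[0.5, -1.0, 2.0], [1.0, 0.0, -0.5]]
ips_tokens = [[0.25, 0.0, -1.0], [0.0, 0.5, 0.5]]
shifted_tokens = shift_mask_tokens(frozen_mask_tokens, ips_tokens)
```

Because the shift is a separate additive term, an all-zero offset reproduces the original tokens exactly, which is one way to read the "non-invasive" property claimed in the abstract.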

Paper Structure (17 sections, 3 equations, 5 figures, 6 tables)

Introduction
Adaptive Prompt
Motivation
Proposed Method
Incremental Pattern Shifting
Rethink the mask decoder
Proposed Method
ProMISe framework
Experiments
Experimental Setup
Adaptive Prompt
Pattern Shifting
ProMISe framework
Multi-Modality Experiments
Ablation Study
...and 2 more sections

Figures (5)

Figure 1: Overview of ProMISe. All three components of the original SAM are frozen. The Auto-Prompting Module (APM) leverages features from the image encoder to predict optimal prompts in Euclidean space. The Pattern Embedding (PaE) module analyzes the image embedding to extract pattern gaps between the target and source domains. The Incremental Pattern Shifting (IPS) tokens are added to the mask tokens of the output tokens to realize the decoder's shifting of mask patterns.
Figure 2: Detailed structure of APM and PaE. The APM can be implemented using various operators and modules, such as CNN and Transformer.
Figure 3: Theoretical illustration of IPS. Left: Image with point prompts; Middle: Output mask from vanilla SAM; Right: Output mask from SAM with IPS. Arrows represent pattern shifting.
Figure 4: Comparison of performance with different numbers of prompts on endoscopy datasets. The 3P, 5P, and 16P settings contain 1, 2, and 8 positive points and 2, 3, and 8 negative points, respectively.
Figure 5: Multi-modality training performance of pattern shifting with 16 points (mDice scores). The corresponding results of MedSAM-2D and our proposed IPS are represented by red lines and green dashed lines, respectively, as shown in Tables \ref{tab: IPS1} and \ref{tab: IPS2}. The blue bars represent the performance of IPS training with both endoscopy and dermoscopy images.
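The prompt settings named in the Figure 4 caption can be written down as a small lookup, which makes the positive/negative split of each setting explicit. This is a sketch under assumptions: the names `PROMPT_CONFIGS` and `point_labels` are hypothetical, and the 1 = positive, 0 = negative encoding follows the point-label convention of the public `segment_anything` predictor rather than anything stated on this page.

```python
# Hypothetical helper; the counts come from the Figure 4 caption.
from typing import Dict, List, Tuple

# Setting name -> (number of positive points, number of negative points).
PROMPT_CONFIGS: Dict[str, Tuple[int, int]] = {
    "3P": (1, 2),
    "5P": (2, 3),
    "16P": (8, 8),
}


def point_labels(config: str) -> List[int]:
    """Build the label list for one setting: 1 = positive, 0 = negative."""
    n_pos, n_neg = PROMPT_CONFIGS[config]
    return [1] * n_pos + [0] * n_neg


labels_5p = point_labels("5P")
```

Each label would be paired with an (x, y) coordinate; in ProMISe those coordinates are what the APM is described as predicting, which is outside the scope of this sketch.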
