Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions

Yichi Zhang, Zhenrong Shen, Rushi Jiao

TL;DR

The paper surveys the Segment Anything Model (SAM) for medical image segmentation, focusing on zero-shot performance and adaptation strategies across diverse modalities. It analyzes how directly applying SAM yields inconsistent results because of the domain shift between natural and medical images, and reviews methods that bridge the gap, including full or partial fine-tuning, parameter-efficient adapters, auto-prompting, and 3D extensions. The review highlights that while zero-shot SAM often falls short on medical tasks, carefully designed adaptations (e.g., MedSAM, DeSAM, AutoSAM, and 3D SAM variants) can achieve competitive performance and enable annotation-efficient workflows. These insights underscore the potential of foundation models in medical image analysis and point to large-scale medical datasets and multi-modal integration as critical future directions for clinically deployable segmentation models.

Abstract

Due to the inherent flexibility of prompting, foundation models have emerged as the predominant force in the fields of natural language processing and computer vision. The recent introduction of the Segment Anything Model (SAM) signifies a noteworthy expansion of the prompt-driven paradigm into the domain of image segmentation, thereby introducing a plethora of previously unexplored capabilities. However, the viability of its application to medical image segmentation remains uncertain, given the substantial distinctions between natural and medical images. In this work, we provide a comprehensive overview of recent endeavors aimed at extending the efficacy of SAM to medical image segmentation tasks, encompassing both empirical benchmarking and methodological adaptations. Additionally, we explore potential avenues for future research directions in SAM's role within medical image segmentation. While direct application of SAM to medical image segmentation does not yield satisfactory performance on multi-modal and multi-target medical datasets so far, numerous insights gleaned from these efforts serve as valuable guidance for shaping the trajectory of foundational models in the realm of medical image analysis. To support ongoing research endeavors, we maintain an active repository that contains an up-to-date paper list and a succinct summary of open-source projects at https://github.com/YichiZhang98/SAM4MIS.

Paper Structure (34 sections, 15 figures)

This paper contains 34 sections, 15 figures.

Figures (15)

  • Figure 1: A brief chronology of Segment Anything Model (SAM) [SAM-Meta] and its variants for medical image segmentation in 2023.
  • Figure 2: Overview of the architecture of Segment Anything Model (SAM), which adopts an image encoder to extract image embeddings, a prompt encoder to integrate user interactions via different prompt modes, and a mask decoder to predict segmentation masks by fusing image embeddings and prompt embeddings.
  • Figure 3: SAM's zero-shot evaluation pipeline on medical image segmentation in a large-scale empirical study [SAM-SZU].
  • Figure 4: Quantitative segmentation results of SAM on 18 different imaging modalities in a large-scale empirical study [SAM-SZU].
  • Figure 5: MedSAM [MedSAM] adapts SAM for medical image segmentation by freezing the prompt encoder while fine-tuning the image encoder and the mask decoder.
  • ...and 10 more figures