SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model

Saikat Roy, Tassilo Wald, Gregor Koehler, Maximilian R. Rokuss, Nico Disch, Julius Holzschuh, David Zimmerer, Klaus H. Maier-Hein

TL;DR

The paper evaluates SAM's zero-shot segmentation performance on abdominal CT organ segmentation using point and bounding-box prompts. Working on axial slices from the AMOS dataset, it reports that bounding-box prompts, even with moderate jitter, achieve Dice scores competitive with nnU-Net baselines, and that a single box prompt outperforms multiple point prompts. The findings suggest that SAM could accelerate interactive, semi-automatic clinical segmentation workflows without extensive retraining, and that it could serve as a starting point for domain adaptation to medical imaging. Although SAM does not surpass state-of-the-art fully automated methods, the paper highlights its utility in clinician-in-the-loop pipelines.
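For readers unfamiliar with the Dice score mentioned above, the following is a minimal sketch of how it can be computed for a single 2D slice. It is an illustration only, not the paper's evaluation code: the function name `dice_score`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are all assumptions made here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks (nested lists of 0/1).

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    """
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p:
                pred_sum += 1
            if t:
                target_sum += 1
    if pred_sum + target_sum == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy example: 4 foreground pixels in each mask, 3 of them shared.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(dice_score(pred, truth))  # 2 * 3 / (4 + 4) = 0.75
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.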

Abstract

Foundation models have taken over natural language processing and image generation domains due to the flexibility of prompting. With the recent introduction of the Segment Anything Model (SAM), this prompt-driven paradigm has entered image segmentation with a hitherto unexplored abundance of capabilities. The purpose of this paper is to conduct an initial evaluation of the out-of-the-box zero-shot capabilities of SAM for medical image segmentation, by evaluating its performance on an abdominal CT organ segmentation task, via point or bounding box based prompting. We show that SAM generalizes well to CT data, making it a potential catalyst for the advancement of semi-automatic segmentation tools for clinicians. We believe that this foundation model, while not reaching state-of-the-art segmentation performance in our investigations, can serve as a highly potent starting point for further adaptations of such models to the intricacies of the medical domain.

Keywords: medical image segmentation, SAM, foundation models, zero-shot learning

Paper Structure (9 sections, 1 figure, 1 table)

This paper contains 9 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Examples of random point and jittered box prompts with subsequently generated segmentation masks. Prompt points and boxes are represented in green, while the obtained segmentations are shown in blue.
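The random point and jittered box prompts described in the Figure 1 caption can be simulated from a ground-truth organ mask. The sketch below is a hedged illustration, not the authors' procedure: the helper names (`random_point_prompts`, `tight_box`, `jittered_box`) are hypothetical, and the jitter model (each box edge shifted by a uniform offset of up to `jitter` times the box width or height, then clipped to the image) is an assumption made here for clarity.

```python
import random


def foreground_pixels(mask):
    """Return (x, y) coordinates of all nonzero pixels in a 2D mask."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def random_point_prompts(mask, n_points, rng):
    """Sample up to n_points distinct foreground pixels as point prompts."""
    pixels = foreground_pixels(mask)
    return rng.sample(pixels, min(n_points, len(pixels)))


def tight_box(mask):
    """Tight bounding box around the foreground as (x0, y0, x1, y1)."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (min(xs), min(ys), max(xs), max(ys))


def jittered_box(mask, jitter, rng):
    """Perturb each edge of the tight box (assumed jitter model, see lead-in)."""
    x0, y0, x1, y1 = tight_box(mask)
    height, width = len(mask), len(mask[0])
    dx = jitter * (x1 - x0 + 1)  # maximum horizontal shift in pixels
    dy = jitter * (y1 - y0 + 1)  # maximum vertical shift in pixels

    def clip(value, upper):
        return max(0.0, min(float(upper), value))

    nx0 = clip(x0 + rng.uniform(-dx, dx), width - 1)
    nx1 = clip(x1 + rng.uniform(-dx, dx), width - 1)
    ny0 = clip(y0 + rng.uniform(-dy, dy), height - 1)
    ny1 = clip(y1 + rng.uniform(-dy, dy), height - 1)
    # Re-order in case a large jitter flipped the two edges.
    return (min(nx0, nx1), min(ny0, ny1), max(nx0, nx1), max(ny0, ny1))


# Toy "organ" mask on a 4 x 6 slice.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
points = random_point_prompts(mask, 3, rng)
box = jittered_box(mask, 0.1, rng)
print(points, box)
```

The boxes are kept in (x0, y0, x1, y1) order so that they could be handed to a box-prompted segmentation model; how the resulting prompts are passed to SAM itself is outside the scope of this sketch.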