Medical Image Segmentation with SAM-generated Annotations

Iira Häkkinen, Iaroslav Melekhov, Erik Englesson, Hossein Azizpour, Juho Kannala

TL;DR

This study investigates using SAM as a medical data annotation tool by generating pseudo labels for CT segmentation tasks in MSD and training UNets in a weakly supervised manner. Through a systematic ablation of prompting strategies, the authors find that bounding box prompts provide robust pseudo labels and that UNets trained on these labels achieve performance close to fully supervised models across six abdominal organs. The work demonstrates that SAM-generated annotations can substantially reduce labeling effort while preserving accuracy, highlighting the Box prompt as a practical annotation workflow. Overall, the findings support wider adoption of SAM-based pseudo labeling for scalable medical image segmentation and motivate further exploration of domain-specific fine-tuning and prompting strategies.

Abstract

The field of medical image segmentation is hindered by the scarcity of large, publicly available annotated datasets. Not all datasets are made public for privacy reasons, and creating annotations for a large dataset is time-consuming and expensive, as it requires specialized expertise to accurately identify regions of interest (ROIs) within the images. To address these challenges, we evaluate the performance of the Segment Anything Model (SAM) as an annotation tool for medical data by using it to produce so-called "pseudo labels" on the Medical Segmentation Decathlon (MSD) computed tomography (CT) tasks. The pseudo labels are then used in place of ground truth labels to train a UNet model in a weakly-supervised manner. We experiment with different prompt types on SAM and find that the bounding box prompt is a simple yet effective method for generating pseudo labels. This method allows us to develop a weakly-supervised model that performs comparably to a fully supervised model.
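
As a rough illustration of the bounding box prompt, the sketch below derives a tight box from a 2D binary mask. How the boxes are obtained is not stated here, so taking them from a reference mask (to simulate an annotator drawing a box) is an assumption, as are the helper name `mask_to_box` and its `margin` parameter. The box is returned in XYXY pixel order, the format that the `box` argument of `SamPredictor.predict` in the `segment_anything` package expects.

```python
from typing import Optional

import numpy as np


def mask_to_box(mask: np.ndarray, margin: int = 0) -> Optional[np.ndarray]:
    """Return [x_min, y_min, x_max, y_max] enclosing the foreground of a 2D mask.

    Hypothetical helper: simulates an annotator's box by taking the tight
    bounding box of a reference mask, optionally loosened by `margin` pixels.
    Returns None when the slice contains no foreground.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None  # nothing to prompt on in this slice

    height, width = mask.shape
    # Clamp the loosened box to the image bounds.
    x_min = max(int(cols.min()) - margin, 0)
    y_min = max(int(rows.min()) - margin, 0)
    x_max = min(int(cols.max()) + margin, width - 1)
    y_max = min(int(rows.max()) + margin, height - 1)
    return np.array([x_min, y_min, x_max, y_max])


# Toy 6x8 slice with a 2x3 foreground region.
toy_mask = np.zeros((6, 8), dtype=np.uint8)
toy_mask[2:4, 3:6] = 1
tight_box = mask_to_box(toy_mask)
loose_box = mask_to_box(toy_mask, margin=1)
```

Slices without foreground yield `None` and would have to be skipped or handled separately when generating pseudo labels; the sketch leaves that choice open.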

Paper Structure (10 sections, 3 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Pipeline. A set of 2D CT scans is propagated through the pre-trained SAM model (Kirillov et al., 2023) to obtain the corresponding pseudo segmentation masks (cf. Section \ref{ssec:label-gen}). Two independent UNet models are then trained (cf. Section \ref{ssec:unet-training} and Section \ref{ssec:optimization}) using ground truth and pseudo labels to perform semantic segmentation.
  • Figure 2: Qualitative semantic segmentation results. Each column displays one example case from each of the six segmentation tasks (from left to right: Liver, Lung, Pancreas, Hepatic Vessel, Spleen, Colon). The rows, from top to bottom, display the ground truth label, SAM's predictions (with bounding box prompt), the predictions of the UNet trained on ground truth labels, and the predictions of the UNet trained on pseudo labels.
  • Figure 3: Ablation on different prompt types. Segmentation masks produced by SAM for each prompt type; green and red stars mark positive and negative point prompts, respectively.
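
The positive and negative point prompts mentioned for Figure 3 can be sketched as follows. The sampling scheme (uniformly random foreground and background pixels of a reference mask) and the helper name `sample_point_prompts` are assumptions made for illustration, not the procedure used in the paper. The output follows the convention of `SamPredictor.predict` in the `segment_anything` package: `point_coords` as an N×2 array of (x, y) pixels and `point_labels` with 1 for foreground and 0 for background.

```python
import numpy as np


def sample_point_prompts(mask: np.ndarray, n_pos: int = 1, n_neg: int = 1, seed: int = 0):
    """Sample positive (inside) and negative (outside) point prompts from a 2D mask.

    Hypothetical helper: points are drawn uniformly at random without
    replacement. Returns (coords, labels), where coords is an (N, 2) array
    of (x, y) pixel positions and labels holds 1 (positive) or 0 (negative).
    """
    rng = np.random.default_rng(seed)
    foreground = np.argwhere(mask > 0)   # rows of (row, col)
    background = np.argwhere(mask == 0)
    if len(foreground) < n_pos or len(background) < n_neg:
        raise ValueError("mask has too few pixels for the requested prompts")

    pos = foreground[rng.choice(len(foreground), size=n_pos, replace=False)]
    neg = background[rng.choice(len(background), size=n_neg, replace=False)]

    # argwhere yields (row, col); flip the columns to get (x, y).
    coords = np.concatenate([pos, neg])[:, ::-1]
    labels = np.concatenate([np.ones(n_pos, dtype=int), np.zeros(n_neg, dtype=int)])
    return coords, labels


# Toy 6x8 slice with a 2x3 foreground region.
demo_mask = np.zeros((6, 8), dtype=np.uint8)
demo_mask[2:4, 3:6] = 1
demo_coords, demo_labels = sample_point_prompts(demo_mask, n_pos=2, n_neg=3)
```

Fixing `seed` keeps the sampled prompts reproducible across runs, which matters when comparing prompt types on the same slices.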