Segment Anything in Medical Images

Jun Ma; Yuting He; Feifei Li; Lin Han; Chenyu You; Bo Wang

TL;DR

The authors present MedSAM, a deep learning model for efficient and accurate segmentation across a wide range of medical image modalities and anatomies, which holds significant potential to expedite the evolution of diagnostic tools and the personalization of treatment plans.

Abstract

Medical image segmentation is a critical component in clinical practice, facilitating accurate diagnosis, treatment planning, and disease monitoring. However, existing methods, often tailored to specific modalities or disease types, lack generalizability across the diverse spectrum of medical image segmentation tasks. Here we present MedSAM, a foundation model designed for bridging this gap by enabling universal medical image segmentation. The model is developed on a large-scale medical image dataset with 1,570,263 image-mask pairs, covering 10 imaging modalities and over 30 cancer types. We conduct a comprehensive evaluation on 86 internal validation tasks and 60 external validation tasks, demonstrating better accuracy and robustness than modality-wise specialist models. By delivering accurate and efficient segmentation across a wide spectrum of tasks, MedSAM holds significant potential to expedite the evolution of diagnostic tools and the personalization of treatment plans.

Paper Structure (20 sections, 5 equations, 5 figures)

Figures (5)

Figure 1: MedSAM, trained on a large-scale dataset, can handle diverse segmentation tasks. The dataset covers a variety of anatomical structures, pathological conditions, and medical imaging modalities. The magenta contours and mask overlays denote the expert annotations and MedSAM segmentation results, respectively.
Figure 2: a, The number of medical image-mask pairs in each modality. b, MedSAM is a promptable segmentation method where users can use bounding boxes to specify the segmentation targets. Source data are provided as a Source Data file.
Figure 3: Quantitative and qualitative evaluation results on the internal validation set. a, Performance distribution of 86 internal validation tasks in terms of median Dice Similarity Coefficient (DSC) score. The center line within the box represents the median value, with the bottom and top bounds of the box delineating the 25th and 75th percentiles, respectively. Whiskers extend to 1.5 times the interquartile range. Up-triangles denote the minima and down-triangles denote the maxima. b, Podium plots visualizing the performance correspondence of 86 internal validation tasks. Upper part: each colored dot denotes the median DSC achieved with the respective method on one task. Dots corresponding to identical tasks are connected by a line. Lower part: bar charts represent the frequency of achieved ranks for each method. MedSAM ranks in the first place on most tasks. c, Visualized segmentation examples on the internal validation set. The four examples are liver cancer, brain cancer, breast cancer, and polyp in Computed Tomography (CT), Magnetic Resonance Imaging (MRI), ultrasound, and endoscopy images, respectively. Blue: bounding box prompts; yellow: segmentation results; magenta: expert annotations. Source data are provided as a Source Data file.
Figure 4: Quantitative and qualitative evaluation results on the external validation set. a, Performance distribution of 60 external validation tasks in terms of median Dice Similarity Coefficient (DSC) score. The center line within the box represents the median value, with the bottom and top bounds of the box delineating the 25th and 75th percentiles, respectively. Whiskers extend to 1.5 times the interquartile range. Up-triangles denote the minima and down-triangles denote the maxima. b, Podium plots visualizing the performance correspondence of 60 external validation tasks. Upper part: each colored dot denotes the median DSC achieved with the respective method on one task. Dots corresponding to identical tasks are connected by a line. Lower part: bar charts represent the frequency of achieved ranks for each method. MedSAM ranks in the first place on most tasks. c, Visualized segmentation examples on the external validation set. The four examples are the lymph node, cervical cancer, fetal head, and polyp in CT, MRI, ultrasound, and endoscopy images, respectively. Source data are provided as a Source Data file.
Figure 5: a, Scaling up the number of training images to one million can significantly improve the model performance on both internal and external validation sets. b, MedSAM can be used to substantially reduce the annotation time cost. Source data are provided as a Source Data file.
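
The captions above rely on two quantities that are easy to make concrete: the bounding box used as a prompt (Figure 2b) and the Dice Similarity Coefficient (DSC) whose per-task median is plotted in Figures 3 and 4. The following is a minimal sketch in Python with NumPy, not the authors' evaluation code. The function names `dice_similarity` and `mask_to_box`, the toy masks, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Tuple

import numpy as np


def dice_similarity(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice Similarity Coefficient of two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, truth).sum() / total)


def mask_to_box(mask: np.ndarray) -> Tuple[int, int, int, int]:
    """Tight bounding box (x_min, y_min, x_max, y_max) around the nonzero pixels."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask is empty, no bounding box exists")
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


# Toy example: a 4x4 square annotation and a prediction shifted down by one row.
truth = np.zeros((8, 8), dtype=np.uint8)
truth[2:6, 2:6] = 1                      # 16 pixels
pred = np.zeros_like(truth)
pred[3:7, 2:6] = 1                       # 16 pixels, 12 of them overlap truth

box = mask_to_box(truth)                 # box that could serve as a prompt
dsc = dice_similarity(pred, truth)       # 2 * 12 / (16 + 16) = 0.75

# A per-task score as in the box plots: the median DSC over that task's cases.
# The two extra scores are made-up values for illustration only.
case_scores = [dsc, 1.0, 0.5]
task_median = float(np.median(case_scores))
```

In this toy case the box is `(2, 2, 5, 5)`, the DSC is 0.75, and the task median is also 0.75. Reporting the median rather than the mean keeps a few failed cases from dominating a task's score.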