SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model

Trevor J. Chan; Aarush Sahni; Yijin Fang; Jie Li; Alisha Luthra; Alison Pouch; Chamith S. Rajapakse

TL;DR

The paper addresses the bottleneck of data-hungry 3D medical segmentation by proposing SAM3D, a zero-shot, semi-automatic method that extends the 2D Segment Anything Model to 3D through 3D polyline prompts, multi-axis slicing, and a reconstruction pipeline. It demonstrates that high-quality 3D segmentations can be achieved across CT and MRI modalities without additional training, outperforming a leading 2D medical adaptation (MedSAM) in pelvic CT and aorta segmentation while remaining fast. The approach reduces manual input and time, enabling rapid labeling and potential clinical utility in planning, education, and research, with robustness across imaging contrasts. The work suggests that leveraging rich prompts and volumetric redundancy can compensate for the lack of domain-specific 3D training data, pointing to data-efficient directions for future 3D segmentation research.

Abstract

We introduce SAM3D, a new approach to semi-automatic, zero-shot segmentation of 3D images that builds on the existing Segment Anything Model. We achieve fast and accurate segmentations in 3D images with a four-step strategy: user prompting with 3D polylines, volume slicing along multiple axes, slice-wise inference with a pretrained model, and recomposition and refinement in 3D. We evaluate SAM3D qualitatively on an array of imaging modalities and anatomical structures and quantify its performance for specific structures in abdominal pelvic CT and brain MRI. Notably, our method achieves good performance with zero model training or finetuning, making it particularly useful for tasks with a scarcity of preexisting labeled data. By enabling users to create 3D segmentations of unseen data quickly and with dramatically reduced manual input, these methods have the potential to aid surgical planning and education, diagnostic imaging, and scientific research.
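The slice, infer, and recompose idea in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: it assumes axis-aligned slicing along the three orthogonal axes instead of rotationally equispaced ones, swaps the prompted SAM model for a plain intensity threshold, and recombines per-slice masks by voxel-wise voting rather than through a point cloud. The names `threshold_segmenter`, `segment_by_multi_axis_voting`, and `min_votes` are hypothetical.

```python
from typing import Callable, List

Slice2D = List[List[float]]
Volume = List[List[List[float]]]  # indexed as volume[z][y][x]


def threshold_segmenter(image: Slice2D, level: float = 0.5) -> List[List[int]]:
    """Stand-in for a pretrained 2D model (assumption: simple thresholding)."""
    return [[1 if value > level else 0 for value in row] for row in image]


def segment_by_multi_axis_voting(
    volume: Volume,
    segment_2d: Callable[[Slice2D], List[List[int]]] = threshold_segmenter,
    min_votes: int = 2,
) -> List[List[List[int]]]:
    """Segment every slice along z, y, and x, then keep voxels with enough votes."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    votes = [[[0] * nx for _ in range(ny)] for _ in range(nz)]

    # Slices of constant z: each slice is indexed [y][x].
    for z in range(nz):
        mask = segment_2d(volume[z])
        for y in range(ny):
            for x in range(nx):
                votes[z][y][x] += mask[y][x]

    # Slices of constant y: each slice is indexed [z][x].
    for y in range(ny):
        mask = segment_2d([volume[z][y] for z in range(nz)])
        for z in range(nz):
            for x in range(nx):
                votes[z][y][x] += mask[z][x]

    # Slices of constant x: each slice is indexed [z][y].
    for x in range(nx):
        mask = segment_2d([[volume[z][y][x] for y in range(ny)] for z in range(nz)])
        for z in range(nz):
            for y in range(ny):
                votes[z][y][x] += mask[z][y]

    # Recomposition: a voxel is foreground if at least min_votes slices agree.
    return [
        [[1 if votes[z][y][x] >= min_votes else 0 for x in range(nx)] for y in range(ny)]
        for z in range(nz)
    ]


# Usage: a 3x3x3 volume with a single bright voxel at the centre.
toy_volume = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
toy_volume[1][1][1] = 1.0
toy_segmentation = segment_by_multi_axis_voting(toy_volume)
```

Because every voxel is seen once per slicing axis, a 2D error on a single slice can be outvoted by the other orientations. This is one way to read the volumetric redundancy that the method relies on.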

Paper Structure (5 sections, 3 figures, 1 table)

Introduction
Methods
Results
Discussion
Conclusion

Figures (3)

Figure 1: An overview of the segmentation method comprising: polyline prompting on a 3D image, slicing along rotationally equispaced axes, 2D inference using SAM, recomposition into a dense point cloud, and voxelization/meshing.
Figure 2: Visualizing diverse segmentation performance. (a) Pelvis and sacral spine in CT. (b) Skeleton in ex vivo CT. (c) Cervical vertebra 3 in CT [loffler2020vertebral]. (d) Lungs in chest CT [ma2021toward]. (e) Lungs, liver, and kidneys in abdominal CT [Landman2015btcv]. (f) Oxygenated blood pool in CT angiogram. (g) Glioblastoma tumor and edema in FLAIR MRI [menze2014multimodal]. (h) Lateral ventricles, cerebellum, and brain stem in T1 MRI. (i) Left ventricular outflow tract, aortic valve, and aortic root in 3D TEE. (j) Tumor lesion in 3D breast ultrasound [tdscabus]. (k) Hippocampal axonal neurons in volumetric scanning electron microscopy (SEM) [lucchi2011supervoxel]. (l) Tobacco leaf cell central vacuole in volumetric SEM [wickramanayake2023conventional].
Figure 3: Quantification of segmentation accuracy on benchmark datasets, comparison against MedSAM, and additional experiments. (a) Dice score calculated for the lung and liver masks ($n=16$) on the BTCV dataset. (b) Dice score calculated for the tumor region (enhanced + nonenhanced tumor/necrotic) and the tumor+edema regions ($n=8$) for 4 MRI contrasts on the BraTS dataset. (c) Dice scores for segmentation performance on brain metastases ($n=4$), pelvis ($n=3$), and aorta ($n=4$) for both our method (SAM3D) and MedSAM. (d) An analysis of segmentation accuracy and time as a function of the number of transforms chosen for liver segmentation in the BTCV dataset ($n=3$) and tumor segmentation in the BraTS dataset ($n=3$) suggests that the optimal number of transforms to use depends heavily on the anatomical structure to segment. (e-g) Representative segmentation predictions shown in 2D and 3D in three zero-shot tasks for our method and MedSAM.
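
The Dice score reported in Figure 3 measures the overlap between a predicted mask and a reference mask as $2|A \cap B| / (|A| + |B|)$. A minimal sketch of that calculation follows. The function name `dice_score` and the convention that two empty masks score 1.0 are assumptions made for illustration, not details taken from the paper.

```python
from typing import Iterable


def dice_score(pred: Iterable[int], truth: Iterable[int]) -> float:
    """Dice overlap of two binary masks given as flat sequences of voxels."""
    pred_mask = [bool(p) for p in pred]
    truth_mask = [bool(t) for t in truth]
    if len(pred_mask) != len(truth_mask):
        raise ValueError("masks must contain the same number of voxels")

    # |A ∩ B|: voxels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_mask, truth_mask))
    # |A| + |B|: total foreground voxels across the two masks.
    total = sum(pred_mask) + sum(truth_mask)

    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Usage: one shared foreground voxel, 2 predicted and 1 reference in total.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])  # 2 * 1 / (2 + 1)
```

A 3D mask can be passed by flattening it first, since the score depends only on voxel counts and not on their spatial arrangement.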
