AutoProSAM: Automated Prompting SAM for 3D Multi-Organ Segmentation

Chengyin Li; Prashant Khanduri; Yao Qiang; Rafi Ibn Sultan; Indrin Chetty; Dongxiao Zhu

Abstract

Segment Anything Model (SAM) is one of the pioneering prompt-based foundation models for image segmentation and has been rapidly adopted for various medical imaging applications. However, in clinical settings, creating effective prompts is notably challenging and time-consuming, requiring the expertise of domain specialists such as physicians. This requirement significantly diminishes SAM's primary advantage, its interactive capability with end users, in medical applications. Moreover, recent studies have indicated that SAM, originally designed for 2D natural images, performs suboptimally on 3D medical image segmentation tasks. This subpar performance is attributed to the domain gaps between natural and medical images and the disparities in spatial arrangements between 2D and 3D images, particularly in multi-organ segmentation applications. To overcome these challenges, we present a novel technique termed AutoProSAM. This method automates 3D multi-organ CT-based segmentation by leveraging SAM's foundational model capabilities without relying on domain experts for prompts. The approach utilizes parameter-efficient adaptation techniques to adapt SAM for 3D medical imagery and incorporates an effective automatic prompt learning paradigm specific to this domain. By eliminating the need for manual prompts, it enhances SAM's capabilities for 3D medical image segmentation and achieves state-of-the-art (SOTA) performance in CT-based multi-organ segmentation tasks. The code is available at https://github.com/ChengyinLee/AutoProSAM_2024.

Paper Structure (25 sections, 2 equations, 4 figures, 6 tables)

Introduction
Related Work
Foundation Computer Vision Models
Parameter-efficient Model Fine-Tuning
Adapting SAM to Medical Images
Automatic Prompts Generation
Method
SAM Architecture
Handling 3D Medical Inputs
Positional Encoding Enhancement
Patch Embedding Adjustments
Adapting Attention Block
Bottleneck Modifications
Auto Prompt Generator
Mask Decoder
...and 10 more sections

Figures (4)

Figure 1: Challenges associated with using SAM for medical image segmentation: (A) a t-SNE plot of embeddings encoded by SAM's image encoder, showing differences between medical image datasets such as AMOS (Ji et al., 2022) and BTCV (Landman et al., 2015) and natural image datasets such as ADE20K (Zhou et al., 2017) and COCO (Lin et al., 2014); (B) the requirement of manually generated prompts from domain experts for SAM-based medical image segmentation.
Figure 2: (A) The overall architecture of the AutoProSAM, (B) the design of the Depth Adapter module, which utilizes parameter-efficient model fine-tuning, and (C) the architecture of the Auto Prompt Generator, featuring a U-Net-like encoder-decoder design.
Figure 3: Qualitative visualizations compare our AutoProSAM with baseline methods using three subjects from public datasets (Rows 1-3) and two subjects from the private Institutional Pelvic dataset (Rows 4-5). Enhanced areas in these visualizations illustrate improvements in segmenting the left kidney (light red) and pancreas (beige). Additionally, segmentation masks are shown for the prostate (blue), bladder (green), and rectum (red).
Figure 4: Qualitative visualizations compare our AutoProSAM with baseline methods on the CT-ORG dataset.
