Segment Any Medical Model Extended

Yihao Liu; Jiaming Zhang; Andres Diaz-Pinto; Haowei Li; Alejandro Martin-Gomez; Amir Kheradmand; Mehran Armand

TL;DR

SAMM Extended (SAMME) is a unified platform for using, modifying, and validating SAM and its variants in medical image segmentation. It integrates new SAM variant models, adopts faster communication protocols, accommodates new interactive modes, and allows fine-tuning of model subcomponents.

Abstract

The Segment Anything Model (SAM) has drawn significant attention from researchers who work on medical image segmentation because of its generalizability. However, researchers have found that SAM may have limited performance on medical images compared to state-of-the-art non-foundation models. Regardless, the community sees potential in extending, fine-tuning, modifying, and evaluating SAM for the analysis of medical images. A growing number of published works pursue these four directions and propose variants of SAM. A unified platform would therefore help push the boundary of the foundation model for medical images by facilitating the use, modification, and validation of SAM and its variants in medical image segmentation. In this work, we introduce SAMM Extended (SAMME), a platform that integrates new SAM variant models, adopts faster communication protocols, accommodates new interactive modes, and allows for fine-tuning of subcomponents of the models. These features can expand the potential of foundation models like SAM, and the results can be translated to applications such as image-guided therapy, mixed reality interaction, robotic navigation, and data augmentation.

Paper Structure (10 sections, 5 figures, 1 table)

This paper contains 10 sections, 5 figures, 1 table.

Introduction
Methods
Segmentation Process
Prompt Propagation
3D Bounding Box
Architecture
Integration of the SAM Variants
Results
Conclusion
Acknowledgments

Figures (5)

Figure 1: Examples of 3D bounding box prompts and the segmented 3D meshes using the vanilla_vit_b model, a pretrained model provided with the original SAM. The datasets are 3D Slicer sample datasets, listed from left to right: MRBrainTumor1, CTA Abdomen, MRHead, CTLiver (here used for spine segmentation), CBCTDentalSurgery, and CTChest.
Figure 2: (a) The workflow for regular volumetric segmentation using 2D image segmentation tools. (b) The workflow for segmenting in SAMME using 2D prompts. (c) The workflow for segmenting in SAMME using 3D bounding boxes. In SAMME, mask inference using 2D prompts runs in real time. At each cycle, the prompts are synchronized with the mask inference, so the "Enter Prompts" and "Get Masks" steps are iterative. Mask inference using 3D prompts is automated, so the workflow takes in prompts only once.
Figure 3: The architecture of SAMME. The 3D Slicer components handle the data storage, visualization, user interaction, as well as additional off-the-shelf functionalities. The SAMME Server runs the task queue which performs model computations and mask predictions. The SAMME Bridge interprets and converts the image coordinates data. The purposes of subcomponents are indicated in the corresponding angle brackets. The arrows in the figure are the communication channels between components.
Figure 4: (a) Prompt propagation on neighboring slices along 3 views of medical images (3D Slicer sample data MRBrainTumor1). Segmentation of a volumetric image can be significantly simplified because neighboring slices are similar. The position of each slice is shown at the top left corner of its raw image; the slices span 20 mm in each view. This figure shows that the same prompt can be propagated along a view, so neighboring slices do not need the prompt to be re-entered. In SAMME, this is done simply by scrolling the mouse, and the prompt is synchronized to the next slice. (b) Inference results using the same 2D bounding box with different window and level values (3D Slicer sample data BaselineVolume). The differing inference results show that performance depends on these values, indicating that window and level should be considered during training or fine-tuning of SAM variants. All tests in this figure use the vanilla_vit_b model.
Figure 5: Segmentation results of 5 different SAM models using the same 2D bounding box. The datasets are from 3D Slicer sample data: CT-MR Brain, CTACardio, and CTA Abdomen (Panoramix).
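
The captions for Figures 1, 2, and 4 describe three ideas that a short sketch may make more concrete: window/level values change the image the model sees, a 2D prompt can be reused on neighboring slices, and a 3D bounding box can drive segmentation without per-slice input. The Python below is an illustrative sketch only, not SAMME's implementation. All function names are hypothetical, positions are slice indices rather than millimeters, and applying the same 2D box on every slice covered by a 3D box is an assumption about how such a prompt could be decomposed.

```python
from typing import Dict, List, Tuple

Box2D = Tuple[int, int, int, int]            # (x_min, y_min, x_max, y_max)
Box3D = Tuple[int, int, int, int, int, int]  # (x_min, y_min, z_min, x_max, y_max, z_max)


def apply_window_level(pixels: List[List[float]], window: float, level: float) -> List[List[int]]:
    """Linearly map raw intensities to 0..255 for a given window/level pair.

    Values below (level - window/2) clip to 0 and values above
    (level + window/2) clip to 255, so different window/level settings
    hand the model visibly different 8-bit images.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    low = level - window / 2.0
    result = []
    for row in pixels:
        out_row = []
        for value in row:
            t = (value - low) / window
            t = min(max(t, 0.0), 1.0)  # clip to [0, 1]
            out_row.append(int(round(t * 255)))
        result.append(out_row)
    return result


def propagate_prompt(box: Box2D, start_slice: int, num_slices: int, reach: int) -> Dict[int, Box2D]:
    """Reuse one 2D box on every slice within `reach` slices of `start_slice`.

    Hypothetical helper: it relies on neighboring slices being similar,
    and clamps the range to the valid slice indices of the volume.
    """
    first = max(0, start_slice - reach)
    last = min(num_slices - 1, start_slice + reach)
    return {z: box for z in range(first, last + 1)}


def box3d_to_slice_prompts(box: Box3D) -> Dict[int, Box2D]:
    """Split a 3D box into one 2D box prompt per covered slice (assumed scheme)."""
    x0, y0, z0, x1, y1, z1 = box
    return {z: (x0, y0, x1, y1) for z in range(z0, z1 + 1)}


if __name__ == "__main__":
    row = [[-100.0, 0.0, 80.0, 200.0]]
    print(apply_window_level(row, window=80, level=40))
    print(apply_window_level(row, window=400, level=40))
    print(sorted(propagate_prompt((1, 2, 3, 4), start_slice=1, num_slices=10, reach=3)))
    print(box3d_to_slice_prompts((10, 20, 5, 50, 60, 7)))
```

In this sketch, each per-slice box would be passed to a 2D segmentation model together with the windowed slice. Keeping the window/level step explicit makes it easy to apply the same setting during fine-tuning and inference.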
