One-Prompt to Segment All Medical Images

Junde Wu, Jiayuan Zhu, Yueming Jin, Min Xu

TL;DR

The paper tackles universal medical image segmentation by removing the need for task-specific retraining: the model adapts to an unseen task from a single prompted sample. It introduces the One-Prompt framework, which combines a shared encoder, One-Prompt Former decoders, and a Prompt-Parser that fuses several prompt types (Click, BBox, Doodle, SegLab) into the segmentation of diverse datasets. Trained on 64 datasets with over 3,000 clinician prompts and evaluated on 14 held-out tasks and 11 unseen datasets, the approach outperforms few-shot, one-shot, and SAM-based interactive baselines in zero-shot, transfer, and interactive settings while remaining efficient and keeping prompting requirements practical. The work offers a cost-effective, scalable option for clinical workflows, and the code and data are released to support adoption and benchmarking.

Abstract

Large foundation models, known for their strong zero-shot generalization, have excelled in visual and language applications. However, applying them to medical image segmentation, a domain with diverse imaging types and target labels, remains an open challenge. Current approaches, such as adapting interactive segmentation models like the Segment Anything Model (SAM), require user prompts for each sample during inference. Alternatively, transfer learning methods like few/one-shot models demand labeled samples, leading to high costs. This paper introduces a new paradigm toward universal medical image segmentation, termed 'One-Prompt Segmentation.' One-Prompt Segmentation combines the strengths of one-shot and interactive methods. In the inference stage, with just one prompted sample, it can adeptly handle an unseen task in a single forward pass. We train the One-Prompt Model on 64 open-source medical datasets, accompanied by a collection of over 3,000 clinician-labeled prompts. Tested on 14 previously unseen datasets, the One-Prompt Model showcases superior zero-shot segmentation capabilities, outperforming a wide range of related methods. The code and data are released at https://github.com/KidsWithTokens/one-prompt.
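
The calling pattern described in the abstract (one prompted template, then unprompted queries) can be illustrated with a toy sketch. This is not the One-Prompt Model or its API: `toy_one_prompt_segment`, its parameters, and the intensity-matching rule are hypothetical stand-ins chosen only to show that the user prompts a single template image and every query image is then segmented without further prompts.

```python
import numpy as np

def toy_one_prompt_segment(query_images, template_image, template_click, tol=0.1):
    """Toy stand-in for a One-Prompt style interface (NOT the paper's network).

    The single click on the template acts as the task description: the
    intensity under the click is read once, and every query pixel within
    `tol` of that intensity is marked as foreground.
    """
    row, col = template_click
    target = float(template_image[row, col])
    # One pass over each query; no per-query prompt is required.
    return [np.abs(q - target) <= tol for q in query_images]

# Template image with a bright 2x2 structure, prompted by one click inside it.
template = np.zeros((4, 4))
template[1:3, 1:3] = 1.0

# Two unprompted query images with the same kind of structure elsewhere.
queries = [np.zeros((4, 4)), np.zeros((4, 4))]
queries[0][0:2, 0:2] = 1.0
queries[1][2:4, 2:4] = 1.0

masks = toy_one_prompt_segment(queries, template, template_click=(1, 1))
```

In contrast, an interactive model would need a prompt argument for each query image, and a few-shot model would need a full segmentation label for the template rather than a click.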

Paper Structure (15 sections, 3 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Medical segmentation involves a wide range of different organs, tissues and anatomies. One-Prompt Segmentation is a novel paradigm for building a foundation model that can generalize to unseen tasks. Given an unseen task, the One-Prompt Model only needs the user to prompt one image to grasp the task, which is notably cost-effective compared with interactive and few-shot segmentation.
  • Figure 2: An illustration of the One-Prompt Model, which starts from (a) an overview of the pipeline, and continues with zoomed-in diagrams of individual modules, including (b) One-Prompt Former, and (c) Prompt-Parser.
  • Figure 3: One-Prompt Model vs. Few/One-shot Models on 14 held-out test datasets with 4 different prompts.
  • Figure 4: One-Prompt Model vs. Interactive Segmentation Models on 7 held-out datasets with One-Click and BBox prompts.
  • Figure 5: Visualized comparison of One-Prompt Model and few/zero-shot models. One-Prompt Model is given templates with prompts for the prediction. Few/zero-shot models are given templates with segmentation labels for the prediction.
  • ...and 3 more figures