PGP-SAM: Prototype-Guided Prompt Learning for Efficient Few-Shot Medical Image Segmentation

Zhonghao Yan, Zijin Yin, Tianyu Lin, Xiangzhu Zeng, Kongming Liang, Zhanyu Ma

TL;DR

PGP-SAM adapts the Segment Anything Model (SAM) to medical image segmentation in data-scarce settings by introducing intra-class and inter-class prototypes that guide prompt generation. It couples Contextual Feature Modulation with Progressive Prototype Refinement and a Prototype-based Prompt Generator to synthesize effective prompts without large-scale annotations. Across public and private CT datasets, it achieves state-of-the-art few-shot Dice scores while using only 10% of the available 2D slices, outperforming prompt-free and prompt-based SAM variants. This approach offers a practical path to efficient, specialized medical segmentation with minimal labeling burden and improved boundary precision.
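For readers unfamiliar with the metric, the Dice score mentioned above measures the overlap between a predicted mask and the ground truth as 2|A∩B| / (|A| + |B|). The following is a minimal NumPy sketch of a per-class mean Dice. The function names, the exclusion of class 0 as background, and the handling of empty masks are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty; scoring this as 1.0 is a convention assumed here.
        return 1.0
    return float(2.0 * intersection / denom)

def mean_dice(pred_labels, target_labels, num_classes):
    """Average the per-class Dice over foreground classes 1..num_classes-1."""
    pred_labels = np.asarray(pred_labels)
    target_labels = np.asarray(target_labels)
    scores = [
        dice_score(pred_labels == c, target_labels == c)
        for c in range(1, num_classes)  # class 0 is assumed to be background
    ]
    return float(np.mean(scores))
```

Published evaluations differ in whether Dice is computed per 2D slice or per 3D volume and in how absent classes are treated, so this sketch should not be read as the paper's exact evaluation protocol.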

Abstract

The Segment Anything Model (SAM) has demonstrated strong and versatile segmentation capabilities, along with intuitive prompt-based interactions. However, customizing SAM for medical image segmentation requires massive amounts of pixel-level annotations and precise point- or box-based prompt designs. To address these challenges, we introduce PGP-SAM, a novel prototype-based few-shot tuning approach that uses limited samples to replace tedious manual prompts. Our key idea is to leverage inter- and intra-class prototypes to capture class-specific knowledge and relationships. We propose two main components: (1) a plug-and-play contextual modulation module that integrates multi-scale information, and (2) a class-guided cross-attention mechanism that fuses prototypes and features for automatic prompt generation. Experiments on a public multi-organ dataset and a private ventricle dataset demonstrate that PGP-SAM achieves superior mean Dice scores compared with existing prompt-free SAM variants, while using only 10% of the 2D slices.
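To make the idea of fusing prototypes with image features more concrete, here is a rough NumPy sketch, not the authors' implementation. It assumes class prototypes are obtained by masked average pooling (a common few-shot technique that the abstract does not confirm) and uses plain scaled dot-product cross-attention without learned projections, with prototypes as queries and spatial features as keys and values. All function names and tensor shapes are hypothetical.

```python
import numpy as np

def masked_average_prototypes(features, labels, num_classes):
    """Assumed prototype step: average the features of each class's pixels.

    features: (C, H, W) feature map; labels: (H, W) integer class map.
    Returns an array of shape (num_classes, C).
    """
    channels = features.shape[0]
    tokens = features.reshape(channels, -1).T      # (H*W, C)
    flat_labels = labels.reshape(-1)               # (H*W,)
    prototypes = np.zeros((num_classes, channels), dtype=features.dtype)
    for c in range(num_classes):
        mask = flat_labels == c
        if mask.any():                             # classes absent from the map stay zero
            prototypes[c] = tokens[mask].mean(axis=0)
    return prototypes

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prototype_cross_attention(prototypes, features):
    """Prototypes (K, C) attend over spatial features (C, H, W).

    Returns prompt-like embeddings of shape (K, C) and the attention
    weights of shape (K, H*W).
    """
    channels = features.shape[0]
    tokens = features.reshape(channels, -1).T      # (H*W, C), used as keys and values
    scores = prototypes @ tokens.T / np.sqrt(channels)
    attention = softmax(scores, axis=-1)
    return attention @ tokens, attention
```

In this sketch each class ends up with one embedding that summarizes where its prototype matches the image, which is the general role a generated prompt would play in place of manual points or boxes. The paper's inter- and intra-class prototypes, multi-scale modulation, and refinement stages are not modeled here.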

Paper Structure

This paper contains 12 sections, 7 equations, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: Qualitative results of PGP-SAM and other SAM variants, including prompt-free variants (SAM-LST and SAMed) and a prompt-based variant (SurgicalSAM).
  • Figure 2: Architecture of PGP-SAM. As our main contribution, the Prototype-guided Prompt Learning Module consists of two identical stages and three key components: Contextual Feature Modulation (CFM), Progressive Prototype Refinement (PPR) and Prototype-based Prompt Generator (PPG).