Segment Any Cell: A SAM-based Auto-prompting Fine-tuning Framework for Nuclei Segmentation

Saiyang Na, Yuzhi Guo, Feng Jiang, Hehuan Ma, Junzhou Huang

TL;DR

This work presents Segment Any Cell (SAC), a SAM-based auto-prompting fine-tuning framework for nuclei segmentation that jointly optimizes a LoRA‑augmented image encoder and an automatic, discriminative prompting pipeline. By applying Low-Rank Adaptation directly within the attention QKV matrices and introducing an auto-prompt generator that yields positive/negative prompts, SAC significantly improves segmentation accuracy over SAM variants and other baselines on MoNuSeg and DSB, while reducing reliance on expert prompts. The approach also enables efficient training and demonstrates robustness across tasks, including gland segmentation on GlaS, suggesting broad applicability to semantic segmentation tasks with minimal manual prompting. Overall, SAC offers a practical, automated solution for pathology workflows, combining enhanced model adaptability with intelligent prompting to improve nuclei segmentation performance. Three relations are central to the method's design and optimization: the low-rank update $W_{\Delta} = BA$, the adapted projection $h = W_{0}x + W_{\Delta}x = W_{Q/V}x + BAx$, and the mapping $M = \mathcal{F}(\theta_{u}, I)$.
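The low-rank update $h = W_{0}x + BAx$ can be illustrated with a small NumPy sketch. This is not the authors' implementation: the class name `LoRALinear`, the rank, and the initialization (small random $A$, zero $B$, the usual LoRA convention) are assumptions made for illustration, and $W_{0}$ stands in for a frozen query or value projection in the sense of the $W_{Q/V}$ notation.

```python
import numpy as np

class LoRALinear:
    """Frozen projection W0 plus a trainable low-rank update W_delta = B @ A."""

    def __init__(self, w0, rank, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w0.shape
        self.w0 = w0  # frozen pretrained weights; never updated during fine-tuning
        # Assumed initialization: A small and random, B zero, so W_delta = 0 at the start.
        self.a = rng.normal(0.0, 0.01, size=(rank, d_in))
        self.b = np.zeros((d_out, rank))

    def delta(self):
        # Explicit low-rank update W_delta = B A (only needed for inspection).
        return self.b @ self.a

    def __call__(self, x):
        # h = W0 x + B A x for a batch of row vectors x, without materialising B A.
        return x @ self.w0.T + (x @ self.a.T) @ self.b.T

    def num_trainable(self):
        # Only A and B are trained; W0 stays frozen.
        return self.a.size + self.b.size
```

With a 64×64 projection and rank 4, `num_trainable()` gives 2 · 4 · 64 = 512 parameters, against 4096 in the frozen matrix, which is the source of the training efficiency that low-rank adaptation is generally chosen for.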

Abstract

In the rapidly evolving field of AI research, foundational models like BERT and GPT have significantly advanced language and vision tasks. The advent of pretrain-prompting models such as ChatGPT and the Segment Anything Model (SAM) has further revolutionized image segmentation. However, their applications in specialized areas, particularly in nuclei segmentation within medical imaging, reveal a key challenge: the generation of high-quality, informative prompts is as crucial as applying state-of-the-art (SOTA) fine-tuning techniques on foundation models. To address this, we introduce Segment Any Cell (SAC), an innovative framework that enhances SAM specifically for nuclei segmentation. SAC integrates Low-Rank Adaptation (LoRA) within the attention layer of the Transformer to improve the fine-tuning process, outperforming existing SOTA methods. It also introduces an innovative auto-prompt generator that produces effective prompts to guide segmentation, a critical factor in handling the complexities of nuclei segmentation in biomedical imaging. Our extensive experiments demonstrate the superiority of SAC in nuclei segmentation tasks, proving its effectiveness as a tool for pathologists and researchers. Our contributions include a novel prompt generation strategy, automated adaptability for diverse segmentation tasks, the innovative application of Low-Rank Attention Adaptation in SAM, and a versatile framework for semantic segmentation challenges.
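The abstract does not say how the auto-prompt generator selects its prompts, so the following is only a plausible sketch of the general idea: given a foreground probability map from some auxiliary network, draw positive points from confident foreground and negative points from background. The function name `sample_point_prompts`, the fixed threshold, and uniform random sampling are all assumptions. Points are returned as (x, y) with label 1 for positive and 0 for negative, the convention SAM's point prompts use.

```python
import numpy as np

def sample_point_prompts(prob_map, n_pos, n_neg, threshold=0.5, seed=0):
    """Sample positive/negative point prompts from a foreground probability map.

    Hypothetical helper: thresholding plus uniform sampling is an assumed rule,
    not the selection strategy of the paper.
    """
    rng = np.random.default_rng(seed)
    fg = np.argwhere(prob_map >= threshold)  # (row, col) of predicted nuclei pixels
    bg = np.argwhere(prob_map < threshold)   # (row, col) of predicted background pixels

    def pick(coords, n):
        n = min(n, len(coords))  # never ask for more points than exist
        if n == 0:
            return coords[:0]
        idx = rng.choice(len(coords), size=n, replace=False)
        return coords[idx]

    pos, neg = pick(fg, n_pos), pick(bg, n_neg)
    # Flip (row, col) to (x, y) so the points match image-coordinate conventions.
    points = np.concatenate([pos, neg])[:, ::-1]
    labels = np.concatenate([np.ones(len(pos), dtype=int),
                             np.zeros(len(neg), dtype=int)])
    return points, labels
```

Calling `sample_point_prompts(prob_map, 3, 3)` would yield up to six points with labels `[1, 1, 1, 0, 0, 0]`; a real pipeline would likely prefer high-confidence or well-separated locations over uniform sampling.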

Paper Structure (32 sections, 8 equations, 6 figures, 8 tables)

Figures (6)

  • Figure 1: Comparative segmentation results using the SAM demo: (a) Segmentation results for natural images using SAM. (b) Segmentation results for cell nucleus images using SAM with different prompting strategies: Unprofessional single prompt – A single random prompt provided by a non-expert, potentially acting as a noisy prompt leading to failed segmentation results; Professional single prompt – A single positive prompt given by a professional expert, such as a pathologist; Professional multiple prompts – Multiple positive prompts provided by a professional; Professional w/ negative prompts – A few negative prompts provided by a professional. The blue dots represent the given positive prompts, while the pink dots represent the negative prompts.
  • Figure 2: Overall framework of SAC. Medical images are first fed into the SAM image encoder, whose pretrained parameters are frozen and whose attention layers are each augmented with a low-rank attention adapter for efficient generation of image embeddings. Concurrently, the images are processed by our auto-prompt generator, which produces both positive and negative prompts. These prompts are then fed into the SAM prompt encoder to obtain prompt embeddings. Finally, the image and prompt embeddings are passed to the SAM mask decoder, which is fine-tuned, to produce the final segmentation results. During inference, the framework also accepts manually entered prompts as an option, potentially aiding the segmentation of cell nuclei.
  • Figure 3: Two examples of SAM-FT segmentation results on the MoNuSeg dataset. From left to right: the original histopathology image, followed by outputs obtained with increasing expert intervention (0-expert, 1-expert, and 3-expert annotations).
  • Figure 4: Dice score convergence over epochs for SAM, MSA, and SAC (ours) on the MoNuSeg dataset.
  • Figure 5: Illustration of 14 test images from the MoNuSeg dataset. These images are segmented using only the auxiliary neural network, with varying numbers of SAM prompts to show the effect of prompt quantity. From left to right: original image; segmentation mask; segmentation with 1 positive and 1 negative point; 3 positive and 3 negative points; 8 positive and 8 negative points; and 16 positive and 16 negative points.
  • ...and 1 more figure
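Figure 4 tracks the Dice score, which for a predicted mask $P$ and ground-truth mask $G$ is $2|P \cap G| / (|P| + |G|)$. A minimal sketch for binary masks follows; the function name `dice_score` and the choice to return 1.0 when both masks are empty are assumptions, and the paper's evaluation code may handle edge cases differently.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|P ∩ G| / (|P| + |G|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * intersection / denom)
```

For example, a prediction covering two pixels, one of which overlaps a one-pixel ground truth, scores 2 · 1 / (2 + 1) ≈ 0.667.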