WeakMedSAM: Weakly-Supervised Medical Image Segmentation via SAM with Sub-Class Exploration and Prompt Affinity Mining

Haoran Wang, Lian Huai, Wenbin Li, Lei Qi, Xingqun Jiang, Yinghuan Shi

TL;DR

This work addresses the high labeling cost of medical image segmentation by adapting the Segment Anything Model (SAM) to a weakly-supervised regime. It introduces Sub-Class Exploration to mitigate co-occurrence and Prompt Affinity Mining to refine class activations through a lightweight, prompt-based affinity propagation, yielding accurate segmentations with only image-level labels. The method achieves strong performance on the BraTS 2019, AbdomenCT-1K, and MSD Cardiac datasets, outperforming recent weakly-supervised approaches and approaching interactive SAM methods without requiring pixel-level annotations. Its plug-and-play design, compatibility with multiple SAM backbones, and low computational overhead make it practical for clinical deployment and adaptable to future end-to-end refinements. The work advances label-efficient medical image segmentation by leveraging SAM’s capabilities to mitigate co-occurrence artifacts and to incorporate structural information through prompts.

Abstract

Foundation models have driven remarkable progress in vision tasks. Several recent works have used the Segment Anything Model (SAM) to improve segmentation performance on medical images, and most of them train an adapter by fine-tuning on a large number of pixel-wise annotated medical images in a fully supervised manner. In this paper, to reduce the labeling cost, we investigate a novel weakly-supervised SAM-based segmentation model, namely WeakMedSAM. Specifically, WeakMedSAM contains two modules: 1) to mitigate severe co-occurrence in medical images, a sub-class exploration module is introduced to learn accurate feature representations; 2) to improve the quality of the class activation maps, a prompt affinity mining module uses the prompt capability of SAM to obtain an affinity map for random-walk refinement. Our method can be applied to any SAM-like backbone, and we conduct experiments with SAMUS and EfficientSAM. Experimental results on three widely used benchmark datasets, i.e., BraTS 2019, AbdomenCT-1K, and MSD Cardiac, show the promising performance of WeakMedSAM. Our code is available at https://github.com/wanghr64/WeakMedSAM.
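
The random-walk refinement mentioned in the abstract can be illustrated with a generic sketch. It assumes a dense, non-negative pairwise affinity matrix over pixels and a fixed number of propagation steps. The function name `random_walk_refine`, the parameter `n_steps`, and the row normalization are illustrative assumptions and are not taken from the WeakMedSAM code, which may build and apply its affinities differently.

```python
import numpy as np

def random_walk_refine(cam, affinity, n_steps=2):
    """Propagate CAM scores along pairwise affinities (illustrative only).

    cam:      (H, W) array of class activation scores.
    affinity: (H*W, H*W) array of non-negative pairwise affinities
              (assumed input; how it is derived is outside this sketch).
    """
    h, w = cam.shape
    # Row-normalize so that each row is a transition distribution.
    row_sums = affinity.sum(axis=1, keepdims=True)
    transition = affinity / np.maximum(row_sums, 1e-12)
    scores = cam.reshape(-1).astype(float)
    # Each step replaces a pixel's score by the affinity-weighted
    # average of the scores of the pixels it is connected to.
    for _ in range(n_steps):
        scores = transition @ scores
    return scores.reshape(h, w)
```

With a block-structured affinity, activation spreads only among pixels that share affinity, which is the behavior a refinement step relies on to keep activations inside one structure.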

Paper Structure

This paper contains 38 sections, 5 equations, 16 figures, 9 tables.

Figures (16)

  • Figure 1: Challenges in weakly-supervised medical image segmentation. The yellow line represents the ground truth. Compared to natural images, medical images suffer more from co-occurrence phenomena, and CAM tends to activate spurious areas. For example, the target of brain tumors incorrectly activates the surrounding edema area, while the target of horses does not mistakenly activate the rider.
  • Figure 2: The overall framework of WeakMedSAM. Before training, WeakMedSAM utilizes a pretrained network to extract image features and perform pre-clustering. During the training process, all parameters of SAM are frozen. The generated sub-class labels provide additional classification supervision. Then the CAMs are combined with prompt affinity maps for random walks, resulting in the final pseudo-labels.
  • Figure 3: Activated regions of sub-class and primary class classification. Different sub-classes within the same primary class trigger distinct intra-class discriminative regions. Because the sub-class classification heads absorb extraneous intra-class information, the primary class classification head can learn a more robust inter-class activation representation.
  • Figure 4: Training process without introducing SCE. Focusing solely on optimizing the primary class classification loss $L_p$ can probably lead to improved sub-class feature representations and more accurate class activation regions.
  • Figure 5: Obtaining the affinity map by applying point prompts to SAM. The image is partitioned into a uniform grid, and the center point of each grid cell is used as a point prompt for SAM. All prediction maps $m^\text{aff}_i$ are then aggregated and normalized to obtain the global affinity map $M^\text{aff}$.
  • ...and 11 more figures
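
The grid-prompt procedure described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions: `predict_mask` is a hypothetical callable standing in for a point-prompted SAM forward pass, summation is assumed as the aggregation, and min-max scaling is assumed as the normalization. The caption does not specify either choice, so the paper's implementation may differ.

```python
import numpy as np

def grid_centers(height, width, rows, cols):
    """Return the (y, x) center of every cell in a rows x cols uniform grid."""
    ys = (np.arange(rows) + 0.5) * height / rows
    xs = (np.arange(cols) + 0.5) * width / cols
    return [(float(y), float(x)) for y in ys for x in xs]

def aggregate_affinity(shape, points, predict_mask):
    """Sum the per-prompt prediction maps and rescale the result to [0, 1].

    predict_mask: hypothetical callable mapping a (y, x) point prompt to an
                  array of the given shape (a stand-in for a SAM call).
    """
    total = np.zeros(shape, dtype=float)
    for point in points:
        total += predict_mask(point)  # one map m_i per point prompt
    lo, hi = total.min(), total.max()
    # Assumed min-max normalization; a constant map yields all zeros.
    return (total - lo) / (hi - lo) if hi > lo else np.zeros(shape)
```

A finer grid issues more prompts and therefore costs more forward passes, so the grid size trades affinity resolution against compute.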