Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation

Qingchen Tang, Lei Fan, Maurice Pagnucco, Yang Song

TL;DR

This paper tackles the challenge of weakly supervised segmentation in histopathology by introducing PBIP, a prototype-based image prompting framework. PBIP constructs a multi-prototype image bank and uses a contrastive prototype–foreground matching objective to refine CAMs, addressing inter-class homogeneity and intra-class heterogeneity. It combines a SegFormer-based ClassNet with a MedCLIP-powered ImgMatchNet to generate robust pseudo-masks and then trains a second-stage fully supervised model, achieving state-of-the-art results on four histopathology datasets. The results underscore the efficacy of image-based prompts over text prompts in this domain and highlight the importance of carefully designed prototype-based supervision for complex tissue structures.

Abstract

Weakly supervised image segmentation with image-level labels has drawn attention due to the high cost of pixel-level annotations. Traditional methods using Class Activation Maps (CAMs) often highlight only the most discriminative regions, leading to incomplete masks. Recent approaches that introduce textual information struggle with histopathological images due to inter-class homogeneity and intra-class heterogeneity. In this paper, we propose a prototype-based image prompting framework for histopathological image segmentation. It constructs an image bank from the training set using clustering, extracting multiple prototype features per class to capture intra-class heterogeneity. By designing a matching loss between input features and class-specific prototypes using contrastive learning, our method addresses inter-class homogeneity and guides the model to generate more accurate CAMs. Experiments on four datasets (LUAD-HistoSeg, BCSS-WSSS, GCSS, and BCSS) show that our method outperforms existing weakly supervised segmentation approaches, setting new benchmarks in histopathological image segmentation.
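The matching loss described in the abstract can be illustrated with a small sketch. This is not the paper's exact formulation, which is not given in this summary; it is a generic InfoNCE-style contrastive loss, assuming cosine similarity and a temperature hyperparameter. The names `prototype_matching_loss`, `prototypes`, and `temperature` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def prototype_matching_loss(feature, prototypes, target_class, temperature=0.1):
    """InfoNCE-style loss (an assumed stand-in for the paper's matching loss).

    Pulls `feature` toward every prototype of `target_class` and pushes it
    away from the prototypes of all other classes.

    prototypes: dict mapping class name -> list of prototype vectors, so a
    class may hold several prototypes (intra-class heterogeneity).
    """
    positive = 0.0
    total = 0.0
    for cls, protos in prototypes.items():
        for p in protos:
            weight = math.exp(cosine(feature, p) / temperature)
            total += weight
            if cls == target_class:
                positive += weight
    return -math.log(positive / total)

# Toy 2-D example: two prototypes per class.
prototypes = {
    "tumor": [[1.0, 0.0], [0.9, 0.1]],
    "stroma": [[0.0, 1.0], [0.1, 0.9]],
}
foreground_feature = [1.0, 0.05]
loss_correct = prototype_matching_loss(foreground_feature, prototypes, "tumor")
loss_wrong = prototype_matching_loss(foreground_feature, prototypes, "stroma")
```

In this toy setting the loss is small when the foreground feature is matched against its own class and large otherwise, which is the behavior a contrastive objective needs in order to separate visually similar classes.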

Paper Structure

This paper contains 14 sections, 13 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: a. Four supervision frameworks for histopathological image segmentation: fully supervised with pixel-level masks, CAM-based WSS using image labels, text-based WSS, and our image prompt-based framework. b. The challenges of inter-class homogeneity (similar appearances across classes) and intra-class heterogeneity (variable texture and staining within classes). Cosine similarities are computed using features extracted by the MedCLIP model (Wang et al., 2022).
  • Figure 2: Structure of the proposed PBIP framework. Overview. PBIP consists of two main components: a Classification Network (ClassNet) and an Image Feature Matching Network (ImgMatchNet), which leverage an external image bank to provide image prompts in the form of prototypes. Image Bank. Training images are grouped by their labels and clustered into $K$ subclasses per class. For each subclass, $N_K$ representative images are selected to build the image bank. ClassNet. It receives an input image $\mathbf{X}$ and prototype features $\mathbf{P}$, performing a classification task to generate the pseudo-segmentation mask $\mathbf{M}$. ImgMatchNet. It processes the input image $\mathbf{X}$ and the initial pseudo-segmentation mask $\mathbf{M}$, extracting foreground and background regions. These regions are then matched with $\mathbf{P}$ from the image bank to refine the pseudo mask generation.
  • Figure 3: Ablation study on hyperparameter ratios. The mIoU values are reported on initial pseudo masks for BCSS-WSSS.
  • Figure 4: Ablation study of the number of prototype images. Proto Num is the number of prototype images per sub-bank for each category. We report mIoU values with standard deviations over 10 runs with different random seeds.
  • Figure 5: a. Visualization of the pseudo-segmentation masks generated by models in the first stage on BCSS-WSSS. The masks are overlaid on the input images, with the raw pseudo-segmentation mask shown in the top-left corner. b. The foreground images generated by PBIP for the four segmentation targets. More visualizations are in Supplementary Material.
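
The image bank construction described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary only says "clustering", so plain k-means is assumed here; "representative" is assumed to mean closest to the cluster centroid; and feature vectors are assumed to be precomputed by some encoder. The names `kmeans`, `build_image_bank`, `k`, and `n_per_cluster` (standing in for $K$ and $N_K$) are hypothetical.

```python
import random

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering method)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(features, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in features:
            nearest = min(range(k), key=lambda c: sq_dist(f, centroids[c]))
            clusters[nearest].append(f)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def build_image_bank(features_by_class, k, n_per_cluster):
    """For each class, cluster its features into k subclasses and keep the
    indices of the n_per_cluster images closest to each centroid."""
    bank = {}
    for label, feats in features_by_class.items():
        centroids = kmeans(feats, k)
        assign = [min(range(k), key=lambda c: sq_dist(f, centroids[c])) for f in feats]
        sub_banks = []
        for c in range(k):
            members = [i for i, a in enumerate(assign) if a == c]
            members.sort(key=lambda i: sq_dist(feats[i], centroids[c]))
            sub_banks.append(members[:n_per_cluster])
        bank[label] = sub_banks
    return bank

# Toy features: each class contains two visually distinct sub-populations.
features_by_class = {
    "tumor": [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]],
    "stroma": [[50.0, 0.0], [50.0, 1.0], [60.0, 10.0], [60.0, 11.0]],
}
bank = build_image_bank(features_by_class, k=2, n_per_cluster=1)
```

Here `bank[label]` holds one list of image indices per subclass, so each class contributes several prototypes rather than a single averaged one.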