ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation

Y. Liu, L. Lin, K. K. Y. Wong, X. Tang

TL;DR

ProCNS addresses the core challenge of weakly-supervised medical image segmentation by introducing two synergistic modules: PRSA, which calibrates prototypes through progressive, multi-scale affinities between spatial and semantic elements, and ANPM, which adaptively masks noisy regions to prevent erroneous prototype updates. The framework also provides soft supervision for identified noisy regions and can be integrated as a plug-in with existing WSS methods. Extensive experiments across six diverse medical imaging tasks demonstrate improved Dice scores and competitive efficiency, validating the approach's effectiveness in handling annotation sparsity and boundary ambiguity. The work highlights the potential for prototype-driven refinement and noise-aware supervision to advance label-efficient medical image analysis, while also exploring integration with foundation-model predictions as a future direction.

Abstract

Weakly-supervised segmentation (WSS) has emerged as a solution to mitigate the conflict between annotation cost and model performance by adopting sparse annotation formats (e.g., point, scribble, and block). Typical approaches attempt to exploit anatomy and topology priors to directly expand sparse annotations into pseudo-labels. However, due to a lack of attention to the ambiguous edges in medical images and insufficient exploration of sparse supervision, existing approaches tend to generate erroneous and overconfident pseudo proposals in noisy regions, leading to cumulative model error and performance degradation. In this work, we propose a novel WSS approach, named ProCNS, encompassing two synergistic modules devised with the principles of progressive prototype calibration and noise suppression. Specifically, we design a Prototype-based Regional Spatial Affinity (PRSA) loss to maximize the pair-wise affinities between spatial and semantic elements, providing our model of interest with more reliable guidance. The affinities are derived from the input images and the prototype-refined predictions. Meanwhile, we propose an Adaptive Noise Perception and Masking (ANPM) module that adaptively identifies and masks noisy regions within the pseudo proposals, reducing potential erroneous interference during prototype computation and yielding more enriched and representative prototype representations. Furthermore, we generate specialized soft pseudo-labels for the noisy regions identified by ANPM, providing supplementary supervision. Extensive experiments on six medical image segmentation tasks involving different modalities demonstrate that the proposed framework significantly outperforms representative state-of-the-art methods.
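The abstract describes masking noisy regions so that they do not contaminate prototype computation. As a rough illustration only, the sketch below computes class prototypes by masked average pooling (the pooling named in the Figure 2 caption) over pixels that are not flagged as noisy. The function name `masked_average_prototypes`, the flat per-pixel data layout, and the boolean `noisy` flags are assumptions made for this example. The sketch does not show how ANPM decides which pixels are noisy, and it is not the authors' implementation.

```python
from typing import Dict, List, Sequence


def masked_average_prototypes(
    features: Sequence[Sequence[float]],  # one feature vector per pixel (assumed layout)
    pseudo_labels: Sequence[int],         # one pseudo-label class index per pixel
    noisy: Sequence[bool],                # True where a pixel is flagged as noisy (hypothetical input)
    num_classes: int,
) -> Dict[int, List[float]]:
    """Average the feature vectors of reliable pixels, separately for each class."""
    if not features:
        return {}
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label, is_noisy in zip(features, pseudo_labels, noisy):
        if is_noisy:
            continue  # noisy pixels are excluded from the prototype
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += feat[d]
    # A class with no reliable pixel gets no prototype rather than a zero vector.
    return {
        c: [s / counts[c] for s in sums[c]]
        for c in range(num_classes)
        if counts[c] > 0
    }


# Toy example: four pixels with 2-D features; the last pixel is flagged as noisy.
features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [9.0, 9.0]]
pseudo_labels = [1, 1, 0, 1]
noisy = [False, False, False, True]
prototypes = masked_average_prototypes(features, pseudo_labels, noisy, num_classes=2)
print(prototypes)
```

In this toy case the outlying feature `[9.0, 9.0]` carries a class-1 pseudo-label but is skipped, so the class-1 prototype is the mean of the two reliable class-1 pixels only.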

Paper Structure (34 sections, 17 equations, 10 figures, 10 tables)

Figures (10)

  • Figure 1: Top: Examples of an optical coherence tomography angiography (OCTA) image, a fundus image, an endoscope image, a hematoxylin and eosin (H&E)-stained tissue image, a cardiac magnetic resonance image (Cardiac MRI) and a brain tumor magnetic resonance image (Brain Tumor MRI), coupled with their respective sparse annotations of diverse types. UA, BG, FAZ, OD, OC, PL, NC, LV, MYO, RV and BT respectively represent unlabeled region, background, foveal avascular zone, optic disc, optic cup, polyp, nuclei, left ventricle, myocardium, right ventricle and brain tumor. Bottom: Visualization of pseudo-label error maps generated by TreeEnergy [liang2022tree], DMPLS [luo2022scribble] and our ProCNS.
  • Figure 2: An overview of ProCNS. UR, TR, BGR, RR and NR respectively represent the unlabeled region, target region, background region, reliable region and noisy region. Onehot and MAP respectively denote One-hot-encoding and masked average pooling. In the Initialization stage, a preliminary segmentation model is trained using the sparsely-annotated dataset to generate initial pseudo-labels. In the Main stage, the model is further fine-tuned using dense pseudo-labels. The Main stage consists of two crucial components: the PRSA loss and ANPM.
  • Figure 3: Ablation analysis results of the temporal ensembling strategy at the Initialization stage with regard to DSC. “+” is the average DSC, the central line indicates the median DSC, and data points ($\Circle$, $\Square$) are outliers. “$*$” indicates $p \leq 0.05$ from a Wilcoxon matched-pairs signed rank test.
  • Figure 4: Performance with varied trade-off coefficients $\lambda_3$ and $\lambda_4$.
  • Figure 5: Qualitative evaluation of the prototype-refined predictions. “w/ Refinement” and “w/o Refinement” respectively refer to the outputs of models employing ProCNS with and without the prototype refinement strategy. $\Delta D$ denotes the DSC difference between “w/o Refinement” and “w/ Refinement”.
  • ...and 5 more figures
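
Several captions above report results as DSC, the Dice similarity coefficient, which for binary masks is twice the overlap divided by the sum of the two mask sizes. The sketch below is a minimal version for flattened binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions for this example, and the paper's evaluation code may differ.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary (0/1) masks."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks are empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy example: the prediction covers two pixels, one of which is in the target.
pred = [1, 1, 0, 0]
target = [1, 0, 0, 0]
print(round(dice_score(pred, target), 4))  # 2 * 1 / (2 + 1)
```

A difference such as $\Delta D$ in Figure 5 would then be the gap between two such scores computed against the same reference mask.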