SuperCL: Superpixel Guided Contrastive Learning for Medical Image Segmentation Pre-training

Shuang Zeng, Lei Zhu, Xinliang Zhang, Hangzhou He, Yanye Lu

TL;DR

Medical image segmentation is hindered by scarce labeled data. SuperCL addresses this by pre-training with contrastive learning guided by superpixels: ILCP exploits pixel-level intra-image structure, while IGCP leverages inter-image global relationships via ASP and CCL to generate reliable weak labels. The overall objective is $\mathcal{L}_{total}=\lambda_1\mathcal{L}_{ins}+\lambda_2\mathcal{L}_{intra}+\lambda_3\mathcal{L}_{inter}$. Empirical results on eight downstream datasets show that SuperCL outperforms 12 SOTA CL baselines, with DSC and JC gains at 10% annotations that persist at 25% annotations; in some cases, SuperCL with 25% supervision approaches fully supervised performance. The approach generalizes across domains (CT/MRI) and is compatible with various U-Net backbones, underscoring its practical value for data-efficient medical image segmentation. Overall, SuperCL provides a scalable, annotation-efficient pre-training paradigm that improves segmentation accuracy and reliability in clinical imaging workflows.

Abstract

Medical image segmentation is a critical yet challenging task, primarily due to the difficulty of obtaining extensive datasets of high-quality, expert-annotated images. Contrastive learning presents a potential but still problematic solution to this issue, because most existing methods focus on extracting instance-level or pixel-to-pixel representations and ignore the characteristics shared among groups of similar pixels within an image. Moreover, when generating contrastive pairs, most SOTA methods rely on manually set thresholds, which requires a large number of gradient experiments and lacks efficiency and generalization. To address these issues, we propose a novel contrastive learning approach named SuperCL for medical image segmentation pre-training. Specifically, SuperCL exploits the structural prior and pixel correlation of images by introducing two novel contrastive pair generation strategies: Intra-image Local Contrastive Pairs (ILCP) Generation and Inter-image Global Contrastive Pairs (IGCP) Generation. Considering that superpixel clusters align well with the concept of contrastive pair generation, we utilize the superpixel map to generate pseudo masks for both ILCP and IGCP to guide supervised contrastive learning. Moreover, we propose two modules named Average SuperPixel Feature Map Generation (ASP) and Connected Components Label Generation (CCL) to better exploit the prior structural information for IGCP. Finally, experiments on 8 medical image datasets indicate that SuperCL outperforms 12 existing methods. For example, SuperCL produces more precise predictions in the visualizations and achieves DSC 3.15%, 5.44%, and 7.89% higher than the previous best results on MMWHS, CHAOS, and Spleen, respectively, with 10% annotations. Our code will be released after acceptance.
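
To make the idea of a superpixel pseudo mask guiding supervised contrastive learning concrete, here is a minimal NumPy sketch. It assumes the standard supervised contrastive (SupCon) formulation, in which pixels sharing a superpixel id are positives and all other pixels are negatives. The function name `superpixel_supcon_loss`, the `temperature` value, and the exact loss form are illustrative assumptions; the abstract does not state the precise form of $\mathcal{L}_{intra}$.

```python
import numpy as np

def superpixel_supcon_loss(features, sp_labels, temperature=0.1):
    """SupCon-style loss over pixel embeddings (hypothetical helper).

    features:  (N, D) array of pixel-level projections.
    sp_labels: (N,) array of superpixel ids used as pseudo labels.
    """
    n = features.shape[0]
    self_mask = np.eye(n, dtype=bool)
    # Positives: other pixels that fall in the same superpixel.
    pos = (sp_labels[:, None] == sp_labels[None, :]) & ~self_mask
    n_pos = pos.sum(axis=1)
    valid = n_pos > 0
    if not valid.any():
        return 0.0  # no positive pairs, nothing to contrast

    # Cosine similarity scaled by the temperature.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    # Exclude each pixel's similarity with itself from the softmax.
    sim = np.where(self_mask, -np.inf, sim)
    row_max = sim.max(axis=1, keepdims=True)
    shifted = sim - row_max
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Average log-probability of the positives for each anchor pixel.
    pos_log_prob = np.where(pos, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / n_pos[valid]).mean())

# Toy usage: four pixel embeddings, two superpixels.
feats = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss = superpixel_supcon_loss(feats, np.array([0, 0, 1, 1]))
```

The loss is small when embeddings within a superpixel are close and embeddings from different superpixels are far apart, which is the behavior the pseudo mask is meant to encourage during pre-training.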

Paper Structure

This paper contains 12 sections, 9 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: SuperCL (solid red lines) achieves SOTA segmentation performance (DSC) compared with 7 other CL baselines (dashed lines) across 4 multi-organ (red) and 4 ROI-based (blue) datasets with 25% annotations.
  • Figure 2: Overview of our proposed SuperCL. The image is input into two branches with two different augmentation settings: (1) The spatial invariance group is first fed into the encoder to get the feature map. The feature map is then flattened to obtain the pixel-level projection. Finally, guided by the superpixel pseudo mask (generated with SLIC and flattened), the pixel-level projection is used to optimize $\mathcal{L}_{intra}$ with ILCP (Sect. B). (2) The spatial variance group is propagated through the encoder and a projector to get the instance-level projection. Then, guided by a weak label, the instance-level projection is used to optimize $\mathcal{L}_{inter}$ with IGCP (Sect. C). Notably, the weak label is generated from the feature map and the superpixel pseudo mask by two proposed modules: ASP and CCL (Sect. C). $\mathcal{L}_{ins}$ serves as a baseline loss taken from WCL or PCL.
  • Figure 3: Superpixel-guided intra-image local contrastive pairs generation.
  • Figure 4: Illustration of our proposed ASP and CCL modules. (a) ASP aims at generating a more reliable representation for affinity matrix calculation. (b) CCL aims at transforming the representation into the weak label used for supervised CL.
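
As a rough illustration of the ASP idea in Figure 4(a), the sketch below averages a feature map within each superpixel and computes a cosine affinity matrix between the averaged features. The function names `average_superpixel_features` and `cosine_affinity`, the choice of cosine similarity, and the assumption that the superpixel map has already been resized to the feature map's resolution are all illustrative and not taken from the paper. The CCL step is not sketched, since the caption does not say how the affinity is turned into weak labels.

```python
import numpy as np

def average_superpixel_features(feature_map, sp_map):
    """Mean feature vector per superpixel (hypothetical helper).

    feature_map: (H, W, D) array of features.
    sp_map:      (H, W) integer superpixel ids, same H and W (assumed).
    Returns (ids, means), where means has shape (K, D).
    """
    h, w, d = feature_map.shape
    flat_feat = feature_map.reshape(h * w, d)
    ids, inverse = np.unique(sp_map.reshape(-1), return_inverse=True)
    sums = np.zeros((len(ids), d))
    # Accumulate the features of every pixel into its superpixel's row.
    np.add.at(sums, inverse, flat_feat)
    counts = np.bincount(inverse, minlength=len(ids))
    return ids, sums / counts[:, None]

def cosine_affinity(means, eps=1e-8):
    """Pairwise cosine similarity between superpixel mean features."""
    norms = np.linalg.norm(means, axis=1, keepdims=True)
    z = means / np.maximum(norms, eps)
    return z @ z.T

# Toy usage: a 2x2 feature map with two superpixels (top and bottom row).
fmap = np.array([[[1.0, 0.0], [3.0, 0.0]],
                 [[0.0, 2.0], [0.0, 4.0]]])
spx = np.array([[0, 0], [1, 1]])
ids, means = average_superpixel_features(fmap, spx)
affinity = cosine_affinity(means)
```

Averaging over a superpixel smooths out per-pixel noise, which is one plausible reading of why the caption calls the resulting representation "more reliable" for affinity calculation.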