SuperCL: Superpixel Guided Contrastive Learning for Medical Image Segmentation Pre-training
Shuang Zeng, Lei Zhu, Xinliang Zhang, Hangzhou He, Yanye Lu
TL;DR
Medical image segmentation is hindered by scarce labeled data. SuperCL addresses this by pre-training with contrastive learning guided by superpixels. Intra-image Local Contrastive Pairs (ILCP) generation exploits pixel-level structure within an image, while Inter-image Global Contrastive Pairs (IGCP) generation captures global relationships across images, using Average SuperPixel Feature Map Generation (ASP) and Connected Components Label Generation (CCL) to produce reliable weak labels. The overall objective is $\mathcal{L}_{total}=\lambda_1\mathcal{L}_{ins}+\lambda_2\mathcal{L}_{intra}+\lambda_3\mathcal{L}_{inter}$. On eight downstream datasets, SuperCL outperforms 12 SOTA contrastive learning baselines, with notable DSC and JC gains when 10% of annotations are used and continued improvements at 25%; in some cases, SuperCL with 25% supervision approaches fully supervised performance. The approach generalizes across domains (CT/MRI) and is compatible with various U-Net backbones, which makes it a practical, annotation-efficient pre-training paradigm for medical image segmentation.
Abstract
Medical image segmentation is a critical yet challenging task, primarily because extensive datasets of high-quality, expert-annotated images are difficult to obtain. Contrastive learning is a promising but still problematic solution to this issue: most existing methods focus on extracting instance-level or pixel-to-pixel representations and ignore the characteristics shared by groups of similar pixels within an image. Moreover, when generating contrastive pairs, most SOTA methods rely on manually set thresholds, which require a large number of experiments to tune and lack efficiency and generalization. To address these issues, we propose SuperCL, a novel contrastive learning approach for medical image segmentation pre-training. Specifically, SuperCL exploits the structural prior and pixel correlation of images by introducing two novel contrastive pair generation strategies: Intra-image Local Contrastive Pairs (ILCP) Generation and Inter-image Global Contrastive Pairs (IGCP) Generation. Because superpixel clustering aligns well with the concept of contrastive pair generation, we use the superpixel map to generate pseudo masks for both ILCP and IGCP to guide supervised contrastive learning. We also propose two modules, Average SuperPixel Feature Map Generation (ASP) and Connected Components Label Generation (CCL), to better exploit the prior structural information for IGCP. Finally, experiments on 8 medical image datasets show that SuperCL outperforms 12 existing methods. For example, SuperCL yields more precise predictions in visualizations and achieves DSC scores 3.15%, 5.44%, and 7.89% higher than the previous best results on MMWHS, CHAOS, and Spleen, respectively, with 10% annotations. Our code will be released after acceptance.
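To make the superpixel-guided idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a superpixel label map has already been computed by some superpixel algorithm, and it shows two things the abstract describes only at a high level: averaging features within each superpixel (in the spirit of ASP) and deriving a pseudo mask that marks pixels in the same superpixel as positive pairs. The function names `average_superpixel_features` and `same_superpixel_mask`, and the `(C, H, W)` feature layout, are hypothetical choices made for illustration.

```python
import numpy as np


def average_superpixel_features(features, superpixels):
    """Replace each pixel's feature with the mean feature of its superpixel.

    features:    float array of shape (C, H, W)  (assumed layout)
    superpixels: int array of shape (H, W), one label per pixel
    Returns a float array of shape (C, H, W).
    """
    c, h, w = features.shape
    flat_feats = features.reshape(c, -1)  # (C, H*W)

    # Map arbitrary superpixel labels to contiguous ids 0..K-1.
    _, inverse = np.unique(superpixels.reshape(-1), return_inverse=True)
    inverse = inverse.reshape(-1)
    k = int(inverse.max()) + 1

    # Number of pixels in each superpixel.
    counts = np.bincount(inverse, minlength=k)

    # Per-channel sum of features inside each superpixel.
    sums = np.zeros((c, k), dtype=np.float64)
    for ch in range(c):
        sums[ch] = np.bincount(inverse, weights=flat_feats[ch], minlength=k)

    means = sums / counts  # (C, K) mean feature per superpixel
    # Broadcast each superpixel mean back to its pixels.
    return means[:, inverse].reshape(c, h, w)


def same_superpixel_mask(superpixels):
    """Boolean (H*W, H*W) pseudo mask: True where two pixels share a superpixel."""
    flat = superpixels.reshape(-1)
    return flat[:, None] == flat[None, :]
```

In a supervised contrastive loss, a mask like the one returned by `same_superpixel_mask` could play the role that ground-truth labels normally play: entries marked `True` are treated as positive pairs and the rest as negatives. The pairwise mask grows quadratically with the number of pixels, so a practical version would likely operate on downsampled feature maps or sampled pixels; that choice is not specified in the text above.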
