SCALER: SAM-Enhanced Collaborative Learning for Label-Deficient Concealed Object Segmentation

Chunming He, Rihan Zhang, Longxiang Tang, Ziyun Yang, Kai Li, Deng-Ping Fan, Sina Farsiu

TL;DR

Label-deficient concealed object segmentation (LDCOS) faces severe annotation scarcity and obscured targets. SCALER couples a mean-teacher segmenter with a learnable SAM in two alternating phases to enable bidirectional knowledge transfer: Phase I refines the segmenter using entropy- and uncertainty-weighted pseudo-labels from both the teacher and SAM, while Phase II adapts SAM with augmentation invariance and noise-resistance losses guided by the segmenter’s feedback. Across eight semi- and weakly-supervised COS tasks, SCALER consistently outperforms prior one-way distillation methods and demonstrates strong generalization to other foundation-model settings. This bi-directional collaboration offers a practical pathway to robust COS under limited labels and suggests broader applicability to other label-scarce vision tasks.

Abstract

Existing methods for label-deficient concealed object segmentation (LDCOS) either rely on consistency constraints or Segment Anything Model (SAM)-based pseudo-labeling. However, their performance remains limited due to the intrinsic concealment of targets and the scarcity of annotations. This study investigates two key questions: (1) Can consistency constraints and SAM-based supervision be jointly integrated to better exploit complementary information and enhance the segmenter? and (2) beyond that, can the segmenter in turn guide SAM through reciprocal supervision, enabling mutual improvement? To answer these questions, we present SCALER, a unified collaborative framework toward LDCOS that jointly optimizes a mean-teacher segmenter and a learnable SAM. SCALER operates in two alternating phases. In Phase I, the segmenter is optimized under fixed SAM supervision using entropy-based image-level and uncertainty-based pixel-level weighting to select reliable pseudo-label regions and emphasize harder examples. In Phase II, SAM is updated via augmentation invariance and noise resistance losses, leveraging its inherent robustness to perturbations. Experiments demonstrate that SCALER yields consistent performance gains across eight semi- and weakly-supervised COS tasks. The results further suggest that SCALER can serve as a general training paradigm to enhance both lightweight segmenters and large foundation models under label-scarce conditions. Code will be released.
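As a rough illustration of the Phase I ingredients named above, the following NumPy sketch shows one plausible way to combine an entropy-based image-level weight, a pixel-level uncertainty weight, and a mean-teacher update. It is not the authors' formulation: the abstract gives no equations, so every function name, the choice of teacher/SAM disagreement as the uncertainty proxy, the direction of the weighting, and the momentum value are assumptions made purely for illustration.

```python
import numpy as np

def binary_entropy(p, eps=1e-7):
    """Per-pixel entropy of a foreground-probability map (in nats)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def image_level_weight(prob):
    """Assumed image-level weight: 1 minus the mean normalized entropy.
    A confident pseudo-label map gets a weight near 1, a maximally
    uncertain one (all 0.5) gets a weight near 0."""
    return 1.0 - binary_entropy(prob).mean() / np.log(2.0)

def pixel_level_weight(teacher_prob, sam_prob):
    """Assumed pixel-level weight: teacher/SAM disagreement is used as an
    uncertainty proxy, so pixels where both sources agree weigh more."""
    return 1.0 - np.abs(teacher_prob - sam_prob)

def weighted_bce(student_prob, pseudo_label, pixel_w, image_w, eps=1e-7):
    """Binary cross-entropy against a pseudo-label, scaled by both weights."""
    s = np.clip(student_prob, eps, 1.0 - eps)
    bce = -(pseudo_label * np.log(s) + (1.0 - pseudo_label) * np.log(1.0 - s))
    return image_w * (pixel_w * bce).sum() / (pixel_w.sum() + eps)

def ema_update(teacher, student, momentum=0.99):
    """Standard mean-teacher step: the teacher's parameters track an
    exponential moving average of the student's parameters."""
    return {k: momentum * teacher[k] + (1.0 - momentum) * student[k]
            for k in teacher}

# Toy usage on random 8x8 probability maps (hypothetical data).
rng = np.random.default_rng(0)
teacher_prob = rng.uniform(0.0, 1.0, size=(8, 8))
sam_prob = rng.uniform(0.0, 1.0, size=(8, 8))
student_prob = rng.uniform(0.0, 1.0, size=(8, 8))

# Fuse the two pseudo-label sources by simple averaging (an assumption).
fused = 0.5 * (teacher_prob + sam_prob)
pseudo_label = (fused > 0.5).astype(np.float64)

loss = weighted_bce(
    student_prob,
    pseudo_label,
    pixel_level_weight(teacher_prob, sam_prob),
    image_level_weight(fused),
)
```

The sketch down-weights uncertain pixels and images. The abstract also mentions emphasizing harder examples, which would require a different or additional weighting term that is not specified in this excerpt.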

Paper Structure

This paper contains 14 sections, 19 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: Results of existing LDCOS methods with point supervision, including SCOD [he2022weakly], WS-SAM [he2023weaklysupervised], and SEE [he2025segment]. The suffix "+" denotes integration with SCALER. SCALER yields more accurate concealed-object segmentation and achieves leading results across COD (camouflaged object detection), PIS (polyp image segmentation), and TOD (transparent object detection) under weakly supervised (WS) and semi-supervised (SS) settings. In the top section, concealed-object masks are highlighted in pink and blue.
  • Figure 2: Quality of pseudo-labels from different SAM variants.
  • Figure 3: The bi-directional collaborative learning framework of SCALER. The framework alternates between two optimization phases, which enables the models to mutually enhance each other. Notably, our framework supports the integration of various existing segmenters and SAM/SAM2 fine-tuning approaches. Augmentation strategies are randomly sampled to further enhance training flexibility.
  • Figure 4: Visualizations for COS tasks with point supervision.