
Image-Feature Weak-to-Strong Consistency: An Enhanced Paradigm for Semi-Supervised Learning

Zhiyu Wu, Jinshi Cui

TL;DR

This work targets the limitation of SSL methods that rely solely on image-level perturbations by introducing Image-Feature Weak-to-Strong Consistency (IFMatch), which adds feature-level perturbations to expand the augmentation space. It designs two perturbation positions within residual blocks and three strategies ('movement', 'dropout', 'value'), enabling rich, sample-agnostic augmentations, and integrates them in a triple-branch architecture: a teacher branch and two student branches that fuse image- and feature-level perturbations in complementary ways. A confidence-based identification strategy selectively applies weak feature perturbations to naive samples in the second branch, while preserving compatibility with existing thresholds for the other branch; the overall objective is $\mathcal{L} = \mathcal{L}_s + \lambda_u (\mathcal{L}_{u_1} + \mathcal{L}_{u_2})$, with branch-specific unsupervised losses. Empirically, IFMatch yields consistent improvements across balanced and imbalanced SSL benchmarks and extends to semi-supervised segmentation, reducing reliance on complex threshold dynamics as the augmentation space expands and highlighting the practical impact of richer perturbation design on unlabeled data utilization.
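
As a rough illustration of how the overall objective combines its terms, the sketch below evaluates $\mathcal{L} = \mathcal{L}_s + \lambda_u (\mathcal{L}_{u_1} + \mathcal{L}_{u_2})$ on toy probability vectors. The confidence-masked cross-entropy used for each unsupervised term, the threshold of 0.95, the supervised loss value, and all function names are assumptions made for illustration; the paper's branch-specific losses may be defined differently.

```python
import math

def cross_entropy(target_index, probs, eps=1e-12):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    return -math.log(max(probs[target_index], eps))

def masked_consistency_loss(teacher_probs, student_probs, threshold=0.95):
    """Assumed confidence-masked pseudo-label loss (not taken from the paper).

    The teacher's argmax serves as the pseudo-label; samples whose teacher
    confidence is below `threshold` contribute zero loss.
    """
    total = 0.0
    for t, s in zip(teacher_probs, student_probs):
        confidence = max(t)
        if confidence >= threshold:
            total += cross_entropy(t.index(confidence), s)
    return total / len(teacher_probs)

def total_loss(loss_s, loss_u1, loss_u2, lambda_u=1.0):
    """L = L_s + lambda_u * (L_u1 + L_u2)."""
    return loss_s + lambda_u * (loss_u1 + loss_u2)

# Toy batch: two unlabeled samples, three classes.
teacher = [[0.97, 0.02, 0.01], [0.50, 0.30, 0.20]]    # second sample is masked out
student_1 = [[0.80, 0.10, 0.10], [0.40, 0.40, 0.20]]  # first student branch
student_2 = [[0.60, 0.30, 0.10], [0.30, 0.50, 0.20]]  # second student branch

loss_u1 = masked_consistency_loss(teacher, student_1)
loss_u2 = masked_consistency_loss(teacher, student_2)
loss = total_loss(loss_s=0.25, loss_u1=loss_u1, loss_u2=loss_u2, lambda_u=1.0)
```

Both student branches share the same weight $\lambda_u$ in this sketch, matching the form of the objective above; only the way each branch's predictions are produced would differ.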

Abstract

Image-level weak-to-strong consistency serves as the predominant paradigm in semi-supervised learning (SSL) due to its simplicity and impressive performance. Nonetheless, this approach confines all perturbations to the image level and suffers from the excessive presence of naive samples, thus necessitating further improvement. In this paper, we introduce feature-level perturbation with varying intensities and forms to expand the augmentation space, establishing the image-feature weak-to-strong consistency paradigm. Furthermore, our paradigm develops a triple-branch structure, which facilitates interactions between both types of perturbations within one branch to boost their synergy. Additionally, we present a confidence-based identification strategy to distinguish between naive and challenging samples, thus introducing additional challenges exclusively for naive samples. Notably, our paradigm can seamlessly integrate with existing SSL methods. We apply the proposed paradigm to several representative algorithms and conduct experiments on multiple benchmarks, including both balanced and imbalanced distributions for labeled samples. The results demonstrate a significant enhancement in the performance of existing SSL algorithms.

Paper Structure (25 sections, 13 equations, 9 figures, 10 tables, 1 algorithm)

Figures (9)

  • Figure 1: Overview of the old (left) and proposed (right) paradigms. Our paradigm introduces feature-level perturbation to expand the augmentation space and facilitates direct interactions between both types of perturbations to boost their synergy.
  • Figure 2: Candidate positions for introducing feature-level perturbation. Position A perturbs the output of the residual block, representing strong feature-level perturbation $\mathcal{A}^{\mathcal{F}_s}$. Position B perturbs the output of a random convolution within the residual component, corresponding to weak feature-level perturbation $\mathcal{A}^{\mathcal{F}_w}$. $x^{in} / x^{out}$ denotes input/output feature maps of the residual block.
  • Figure 3: Overview of the confidence-based identification strategy. The approach records sample-wise target confidence in the second student branch and identifies naive samples by comparing that confidence with the threshold in the same branch. w/ and w/o abbreviate with and without, respectively.
  • Figure 4: Visualization of the naive sample ratio and the identification process of SAA [gui2023enhancing] and confidence-based identification on CIFAR-100-400.
  • Figure 5: Visualization of perturbations. We provide the feature map for samples that undergo feature-level augmentation. A naive sample $u$ satisfies $\forall \mathcal{A}^{\mathcal{I}_s}(u), \mathcal{H}(\hat{p}^{\mathcal{I}_w}_i, p^{\mathcal{I}_s}_i) \approx 0$, and thus contributes little to the model's performance.
  • ...and 4 more figures
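
The confidence-based identification described in the Figure 3 caption could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `NaiveSampleIdentifier`, the dictionary bookkeeping keyed by sample id, the reading of "target confidence" as the second student branch's probability for the pseudo-label class, the threshold of 0.95, and the choice to treat unseen samples as challenging are all hypothetical.

```python
class NaiveSampleIdentifier:
    """Tracks per-sample target confidence and flags naive samples (assumed design)."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.confidence = {}  # sample id -> last recorded target confidence

    def record(self, sample_id, student_probs, pseudo_label):
        """Store the second student branch's probability for the pseudo-label class."""
        self.confidence[sample_id] = student_probs[pseudo_label]

    def is_naive(self, sample_id):
        """A sample counts as naive once its recorded confidence reaches the threshold.

        Samples that have never been recorded default to challenging (assumption).
        """
        return self.confidence.get(sample_id, 0.0) >= self.threshold


identifier = NaiveSampleIdentifier(threshold=0.95)
identifier.record(0, [0.98, 0.01, 0.01], pseudo_label=0)  # confidently predicted
identifier.record(1, [0.55, 0.35, 0.10], pseudo_label=0)  # still uncertain

# Decide, per sample, whether the second branch adds weak feature-level perturbation.
apply_weak_feature_perturbation = {i: identifier.is_naive(i) for i in (0, 1, 2)}
```

In this sketch only sample 0 would receive the additional weak feature-level perturbation in the second student branch; samples 1 and 2 are left with image-level perturbation alone.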