Self Adaptive Threshold Pseudo-labeling and Unreliable Sample Contrastive Loss for Semi-supervised Image Classification

Xuerong Zhang, Li Huang, Jing Lv, Ming Yang

TL;DR

This work addresses the instability and underutilization of unlabeled data inherent in pseudo-labeling-based semi-supervised image classification. It introduces STUC-SSIC, a framework that couples Self Adaptive Threshold Pseudo-labeling (SATPL) with Unreliable Sample Contrastive Loss (USCL) to both expand the pool of reliable pseudo-labels and extract discriminative information from low-confidence unlabeled data. SATPL dynamically adjusts global and local class thresholds as training progresses, admitting more pseudo-labels early on and higher-quality ones later, while USCL leverages unreliable samples through a class-aware contrastive objective to speed up convergence. Empirical results on CIFAR-10, CIFAR-100, and STL-10 show that STUC-SSIC achieves higher accuracy and faster convergence than strong SSL baselines, supporting its effectiveness in using unlabeled data more fully and efficiently.

Abstract

Semi-supervised learning is attracting growing attention due to its success in exploiting unlabeled data. However, pseudo-labeling-based semi-supervised approaches suffer from two problems in image classification: (1) Existing methods might fail to adopt suitable thresholds, since they use either a pre-defined (fixed) threshold or an ad-hoc threshold adjustment scheme, resulting in inferior performance and slow convergence. (2) Discarding unlabeled data with confidence below the thresholds results in the loss of discriminative information. To solve these issues, we develop an effective method to make sufficient use of unlabeled data. Specifically, we design a self adaptive threshold pseudo-labeling strategy in which the threshold for each class is dynamically adjusted to increase the number of reliable samples. Meanwhile, to effectively utilize unlabeled data with confidence below the thresholds, we propose an unreliable sample contrastive loss that mines the discriminative information in low-confidence samples by learning the similarities and differences between sample features. We evaluate our method on several classification benchmarks under partially labeled settings and demonstrate its superiority over other approaches.
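
To make the idea of dynamically adjusted per-class thresholds concrete, here is a minimal sketch. This summary does not give the paper's update rule, so the sketch substitutes a FreeMatch-style exponential moving average (EMA) as a stand-in. The function name `update_thresholds`, the `momentum` parameter, and the scaling of the global threshold by normalized class averages are all assumptions made for illustration, not the authors' SATPL formulation.

```python
import numpy as np

def update_thresholds(probs, global_tau, class_avg, momentum=0.99):
    """One hypothetical threshold update from a batch of softmax outputs.

    probs:      (batch, num_classes) predictions on weakly augmented unlabeled data.
    global_tau: scalar EMA of the batch-mean maximum confidence.
    class_avg:  (num_classes,) EMA of the batch-mean probability per class.
    Returns the updated global_tau, the updated class_avg, and per-class thresholds.
    NOTE: this EMA rule is an assumed stand-in, not the paper's exact method.
    """
    probs = np.asarray(probs, dtype=float)
    # Global threshold follows the model's average confidence.
    global_tau = momentum * global_tau + (1.0 - momentum) * probs.max(axis=1).mean()
    # Per-class running average reflects how well each class is currently learned.
    class_avg = momentum * class_avg + (1.0 - momentum) * probs.mean(axis=0)
    # Local thresholds: the best-learned class gets global_tau, others get less.
    local_tau = global_tau * class_avg / class_avg.max()
    return global_tau, class_avg, local_tau
```

Under this assumed rule, thresholds start low when the model is uncertain, so more samples receive pseudo-labels early in training, and they rise as confidence grows, which mirrors the behavior the summary describes.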

Paper Structure (24 sections, 13 equations, 3 figures, 4 tables, 1 algorithm)

This paper contains 24 sections, 13 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Illustration of STUC-SSIC. Weakly augmented labeled data (top) with true labels contribute a supervised loss. For unlabeled data (bottom), self-adaptive threshold pseudo-labeling (SATPL) generates the current local thresholds; a pseudo-label is generated only when the class probability of a weakly augmented sample is higher than the local threshold. The predictions on strongly augmented samples, paired with these pseudo-labels, constitute an unsupervised loss. Unreliable samples (samples below the thresholds) are used to construct a new contrastive loss for training.
  • Figure 2: Ablation study of SATPL on STL-10 with 40 labels, compared to previous methods. (a) Class-average confidence threshold; (b) Mask ratio; (c) Confusion matrix, where the fading color of the diagonal elements indicates the disparity in accuracy.
  • Figure 3: Ablation study of different $\varepsilon_{1}$ and $\varepsilon_{2}$ on CIFAR-10 with 40 labels.
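
The unlabeled branch described in the Figure 1 caption can be sketched as follows, assuming the per-class local thresholds are already available. The helper names `split_by_threshold` and `masked_cross_entropy` are hypothetical, and averaging the loss over the whole batch (rather than over reliable samples only) is an assumed FixMatch-style convention, not something this summary confirms.

```python
import numpy as np

def split_by_threshold(weak_probs, local_tau):
    """Assign pseudo-labels and flag which samples are reliable.

    weak_probs: (batch, num_classes) softmax outputs on weakly augmented samples.
    local_tau:  (num_classes,) per-class thresholds (assumed to be given).
    """
    weak_probs = np.asarray(weak_probs, dtype=float)
    local_tau = np.asarray(local_tau, dtype=float)
    pseudo_labels = weak_probs.argmax(axis=1)
    confidence = weak_probs.max(axis=1)
    # A sample is reliable only if its confidence exceeds its own class threshold.
    reliable = confidence > local_tau[pseudo_labels]
    return pseudo_labels, reliable

def masked_cross_entropy(strong_probs, pseudo_labels, reliable):
    """Cross-entropy of strong-view predictions against pseudo-labels,
    with unreliable samples masked out (batch-mean is an assumed convention)."""
    strong_probs = np.asarray(strong_probs, dtype=float)
    picked = strong_probs[np.arange(len(pseudo_labels)), pseudo_labels]
    per_sample = -np.log(np.clip(picked, 1e-12, 1.0))
    return float((per_sample * reliable).mean())
```

The samples where `reliable` is `False` are the ones the caption routes to the contrastive loss instead of discarding; that loss is not sketched here because its form is not specified in this summary.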