Safe Semi-Supervised Contrastive Learning Using In-Distribution Data as Positive Examples

Min Gu Kwak, Hyungu Kahng, Seoung Bum Kim

TL;DR

This work tackles the practical problem of class distribution mismatch in semi-supervised learning by proposing Safe Semi-Supervised Contrastive Learning (SSCL), which uses in-distribution (ID) data as additional positives within a MoCo-based self-supervised contrastive framework. A novel loss, L_i^{ID}, reuses labeled negatives of the anchor's class as positives; a coefficient schedule w(t) gradually reduces the influence of this loss to prevent overfitting; and a memory queue stores class information alongside the keys for ID-aware sampling. Empirical results on CIFAR-10, CIFAR-100, Tiny ImageNet, and CIFAR-100+Tiny ImageNet under varied mismatch ratios show that SSCL improves representation quality and downstream classification, often outperforming strong baselines and safe SSL methods, with larger gains in challenging scenarios. The results indicate that incorporating ID information through selective positive augmentation and a coefficient schedule yields robust improvements without discarding unlabeled out-of-distribution (OOD) data, and they suggest adaptive scheduling and stronger augmentations as avenues for future work.

Abstract

Semi-supervised learning methods have shown promising results in solving many practical problems when only a few labels are available. Existing methods assume that the class distributions of labeled and unlabeled data are equal; however, their performance degrades significantly in class distribution mismatch scenarios, where out-of-distribution (OOD) data exist in the unlabeled data. Previous safe semi-supervised learning studies have addressed this problem by making OOD data less likely to affect training based on labeled data. However, even if these studies effectively filter out the unnecessary OOD data, they can lose the basic information that all data share regardless of class. To this end, we propose to apply a self-supervised contrastive learning approach to fully exploit a large amount of unlabeled data. We also propose a contrastive loss function with a coefficient schedule that, for each anchor, aggregates the labeled negative examples of the same class into the positive examples. To evaluate the performance of the proposed method, we conduct experiments on image classification datasets (CIFAR-10, CIFAR-100, Tiny ImageNet, and CIFAR-100+Tiny ImageNet) under various mismatch ratios. The results show that self-supervised contrastive learning significantly improves classification accuracy. Moreover, aggregating the in-distribution examples produces better representations and consequently further improves classification accuracy.
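
The idea of treating same-class labeled negatives as positives, weighted by a decaying coefficient, can be illustrated with a small sketch. This summary does not give the exact form of L_i^{ID} or w(t), so everything below is an assumption made for illustration: the ID term is written in a SupCon-style averaged form, the schedule `w` is a hypothetical linear decay, and the function names, the temperature `tau=0.2`, and the use of `None` as the label of unlabeled queue entries are invented here. It is not the authors' implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _logits(q, k_pos, queue, tau):
    # Similarity of the anchor to its augmented view (index 0) and to every queue key.
    return [dot(q, k_pos) / tau] + [dot(q, k) / tau for k in queue]

def info_nce(q, k_pos, queue, tau=0.2):
    """Standard InfoNCE for one anchor: the augmented view is the only positive."""
    logits = _logits(q, k_pos, queue, tau)
    log_z = math.log(sum(math.exp(l) for l in logits))
    return -(logits[0] - log_z)

def id_positive_loss(q, q_label, k_pos, queue, queue_labels, tau=0.2):
    """Assumed SupCon-style ID term: queue keys that share the anchor's label
    are treated as positives. Unlabeled keys (label None) stay negatives."""
    if q_label is None:
        return 0.0  # unlabeled anchors contribute nothing to the ID term
    logits = _logits(q, k_pos, queue, tau)
    log_z = math.log(sum(math.exp(l) for l in logits))
    # +1 because index 0 of logits is the augmented view, not a queue key.
    pos = [i + 1 for i, y in enumerate(queue_labels) if y is not None and y == q_label]
    if not pos:
        return 0.0
    return -sum(logits[i] - log_z for i in pos) / len(pos)

def w(t, total, w0=1.0):
    """Hypothetical linear decay of the ID coefficient from w0 to 0."""
    return w0 * max(0.0, 1.0 - t / total)

def total_loss(q, q_label, k_pos, queue, queue_labels, t, total, tau=0.2):
    # Self-supervised term plus the scheduled ID term.
    return (info_nce(q, k_pos, queue, tau)
            + w(t, total) * id_positive_loss(q, q_label, k_pos, queue, queue_labels, tau))

# Toy usage: 2-D unit vectors, one labeled and one unlabeled key in the queue.
q, k_pos = [1.0, 0.0], [1.0, 0.0]
queue, queue_labels = [[1.0, 0.0], [0.0, 1.0]], [0, None]
early = total_loss(q, 0, k_pos, queue, queue_labels, t=0, total=100)
late = total_loss(q, 0, k_pos, queue, queue_labels, t=100, total=100)
```

In this sketch the ID term has full weight at the start of training and vanishes at the end, so `late` reduces to the plain InfoNCE value. Unlabeled queue entries, including any OOD ones, are never dropped; they simply remain negatives.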

Paper Structure (13 sections, 5 equations, 6 figures, 7 tables)

This paper contains 13 sections, 5 equations, 6 figures, 7 tables.

Figures (6)

  • Figure 1: An example in which the identical class assumption is violated in SSL. OOD data (the car and flower, marked with red boxes) appear in the unlabeled data but not in the labeled data.
  • Figure 2: Simplified semantic data structure of natural images. Two key features exist: (1) class-level information (e.g., a dog's nose, a cat's ears) and (2) image-level information (e.g., backgrounds, textures). The red area indicates potential loss of class-level information when OOD data is excluded.
  • Figure 3: Graphical overview of the proposed method that leverages the labeled data among negative examples in a memory queue as positive examples.
  • Figure 4: Learned representations on an L2-normalized unit hypersphere. (a) A hypersphere on which classes are linearly separable; (b) A hypersphere with the additional property that ID and OOD data are kept apart.
  • Figure 5: t-SNE visualization of learned representations. (a) Vanilla MoCo; (b) The proposed loss function without schedule added; (c) The proposed method with schedule added. Colored, lightly colored, and grey points refer to labeled ID, unlabeled ID, and unlabeled OOD data, respectively.
  • ...and 1 more figures