Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation
Jiachen Liang, Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen
TL;DR
SSFA tackles Feature Distribution Mismatch SSL (FDM-SSL) by decoupling pseudo-label generation from the current model and introducing a self-supervised feature adaptation step. The framework combines a semi-supervised learning module with a self-supervised auxiliary task and a one-shot feature adaptation on unlabeled data to produce higher-quality pseudo-labels, improving performance on labeled, unlabeled, and unseen distributions. Theoretical insights link the method to gradient alignment between the main and auxiliary tasks. Empirical results on CIFAR/CIFAR-C and office datasets show gains over both SSL and UDA baselines, and visualizations indicate improved domain alignment and class separability. Overall, SSFA is a practical, plug-in approach for extending SSL to realistic, mixed-distribution scenarios.
Abstract
Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent, an assumption that rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting in which unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of the labeled samples. Under this setting, previous SSL methods tend to predict wrong pseudo-labels with the model fitted on labeled data, resulting in noise accumulation. To tackle this issue, we propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions. SSFA decouples the prediction of pseudo-labels from the current model to improve the quality of pseudo-labels. In particular, SSFA incorporates a self-supervised task into the SSL framework and uses it to adapt the feature extractor of the model to the unlabeled data. In this way, the extracted features better fit the distribution of the unlabeled data, which yields higher-quality pseudo-labels. Extensive experiments show that SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance on labeled, unlabeled, and even unseen distributions.
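The adapt-then-predict idea in the abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "feature extractor" is a single scalar offset, the auxiliary self-supervised loss is a hypothetical centering loss 0.5 * mean((x - offset)^2) standing in for whatever auxiliary task is actually used, and the classifier is a fixed logistic unit. The sketch only shows the control flow: take one gradient step on the auxiliary loss using the unlabeled batch, apply it to a copy of the extractor parameter, and generate pseudo-labels from the adapted copy rather than from the current model.

```python
import math

def aux_loss_grad(offset, batch):
    """Gradient of the toy auxiliary loss 0.5 * mean((x - offset)^2) w.r.t. offset."""
    return sum(offset - x for x in batch) / len(batch)

def adapt_offset(offset, batch, lr=1.0):
    """Return an adapted copy of the extractor parameter after one gradient step.

    The caller's value is not modified, so pseudo-label prediction stays
    decoupled from the current model.
    """
    return offset - lr * aux_loss_grad(offset, batch)

def predict_proba(x, offset, weight=2.0):
    """Class-1 probability from a fixed logistic classifier on the feature x - offset."""
    return 1.0 / (1.0 + math.exp(-weight * (x - offset)))

def pseudo_label(batch, offset, threshold=0.9):
    """Return (label, keep) pairs; keep is True when confidence reaches the threshold."""
    results = []
    for x in batch:
        p1 = predict_proba(x, offset)
        label = 1 if p1 >= 0.5 else 0
        confidence = max(p1, 1.0 - p1)
        results.append((label, confidence >= threshold))
    return results

# Assumption: the labeled data was centered at 0, so training left offset = 0.
trained_offset = 0.0
# Unlabeled batch from a shifted distribution: class 0 near 3, class 1 near 7.
unlabeled = [2.5, 3.0, 3.5, 6.5, 7.0, 7.5]

# Without adaptation, every shifted sample lands on the class-1 side.
naive = pseudo_label(unlabeled, trained_offset)

# One adaptation step on the unlabeled batch, then predict with the adapted copy.
adapted_offset = adapt_offset(trained_offset, unlabeled)
adapted = pseudo_label(unlabeled, adapted_offset)

print([label for label, _ in naive])
print([label for label, _ in adapted])
```

In this toy setting the unadapted model confidently assigns every shifted sample to class 1, which is the kind of pseudo-label noise the abstract describes, while the adapted copy separates the two clusters. A real implementation would adapt the parameters of a deep feature extractor using an actual self-supervised objective.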
