Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

Jiachen Liang; Ruibing Hou; Hong Chang; Bingpeng Ma; Shiguang Shan; Xilin Chen

TL;DR

SSFA tackles Feature Distribution Mismatch SSL (FDM-SSL), where unlabeled data follow a mixed distribution that differs from the labeled one, by decoupling pseudo-label generation from the current model. The framework combines a semi-supervised learning module with a self-supervised auxiliary task, and uses that task for a one-shot adaptation of the feature extractor on unlabeled data before pseudo-labels are predicted. This improves pseudo-label quality and performance on labeled, unlabeled, and unseen distributions. The theoretical analysis ties the benefit to gradient alignment between the main and auxiliary tasks. Experiments on CIFAR/CIFAR-C and Office datasets show gains over both SSL and UDA baselines, and feature visualizations indicate better domain alignment and class separability. SSFA works as a plug-in for pseudo-label-based SSL learners, extending them to realistic mixed-distribution scenarios.

Abstract

Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent, which rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting, where unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of labeled samples. Under this setting, previous SSL methods tend to predict wrong pseudo-labels with the model fitted on labeled data, resulting in noise accumulation. To tackle this issue, we propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions. SSFA decouples the prediction of pseudo-labels from the current model to improve the quality of pseudo-labels. In particular, SSFA incorporates a self-supervised task into the SSL framework and uses it to adapt the feature extractor of the model to the unlabeled data. In this way, the extracted features better fit the distribution of unlabeled data, thereby generating high-quality pseudo-labels. Extensive experiments show that our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
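
The adaptation step described in the abstract can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the scalar shift `b` standing in for the feature extractor, the squared-mean-feature loss standing in for the self-supervised task, the fixed sigmoid classifier, and the names `adapt`, `pseudo_labels`, `tau`, and `scale`.

```python
import math

def features(x, b):
    """Toy feature extractor (assumption): a learnable shift of a scalar input."""
    return x - b

def aux_loss_grad(batch, b):
    """Gradient w.r.t. b of the stand-in auxiliary loss (mean feature)^2."""
    mean_z = sum(features(x, b) for x in batch) / len(batch)
    return -2.0 * mean_z

def adapt(batch, b, lr=0.5):
    """One gradient step on the auxiliary loss, using unlabeled data only."""
    return b - lr * aux_loss_grad(batch, b)

def pseudo_labels(batch, b, tau=0.9, scale=4.0):
    """Return (label, keep) pairs; keep is True when confidence >= tau."""
    out = []
    for x in batch:
        p1 = 1.0 / (1.0 + math.exp(-scale * features(x, b)))
        label = 1 if p1 >= 0.5 else 0
        out.append((label, max(p1, 1.0 - p1) >= tau))
    return out

def accuracy(preds, truth):
    """Fraction of pseudo-labels that match the (normally hidden) true labels."""
    return sum(int(p == t) for (p, _), t in zip(preds, truth)) / len(truth)

# Hypothetical data: labeled classes sit near -1 and +1, and the unlabeled
# batch is the same two classes shifted by +3 (a feature distribution mismatch).
unlabeled = [1.7, 2.2, 3.8, 4.3, 2.1, 3.9]
truth = [0, 0, 1, 1, 0, 1]

b = 0.0                          # shift fitted on labeled data
b_adapted = adapt(unlabeled, b)  # adapted copy; the original b is not modified

acc_before = accuracy(pseudo_labels(unlabeled, b), truth)
acc_after = accuracy(pseudo_labels(unlabeled, b_adapted), truth)
print(acc_before, acc_after)
```

In this toy the unadapted extractor assigns every shifted sample to class 1, while the adapted copy separates the two groups. The original `b` is left untouched, which loosely mirrors the idea of decoupling pseudo-label prediction from the current model.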

Paper Structure (14 sections, 1 theorem, 10 equations, 5 figures, 6 tables)

Introduction
Related work
Problem Setting
Method
Overview
Semi-Supervised Learning Module
Feature Adaptation Module
Theoretical Insights
Experiments
Experimental Setting
Main Results
Feature Visualization
Ablation Study
Conclusion

Key Result

Lemma 1

Assume that for all $x, y$, $\ell_m(x, y; h)$ is differentiable, convex and $\beta$-smooth in $h$, and both $\| \nabla \ell_m(x, y; h)\|$, $\| \nabla \ell_s(x; h)\| \leq G$ for all $h \in \mathcal{H}$. With a fixed learning rate $\eta=\frac{\epsilon}{\beta G^2}$, for every $x$, $y$ such that $\langle \nabla \ell_m(x, y; h), \nabla \ell_s(x; h) \rangle > \epsilon$, we have $\ell_m(x, y; h) > \ell_m(x, y; h')$, where $h'$ is the updated hypothesis, namely $h' = h - \eta \nabla \ell_{s}(x;h)$.
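
A small numerical check can make the lemma concrete, reading it as: when the inner product of the main-task and auxiliary-task gradients exceeds $\epsilon$, one auxiliary step with $\eta=\frac{\epsilon}{\beta G^2}$ lowers the main loss. The quadratic losses, the centers, and the constants below are hypothetical choices for illustration (with $\beta = 1$ and $G = 1$ holding at the point evaluated), not quantities from the paper.

```python
def grad_quadratic(h, center):
    """Gradient of 0.5 * ||h - center||^2, which is convex and 1-smooth."""
    return [hi - ci for hi, ci in zip(h, center)]

def loss_quadratic(h, center):
    """Value of 0.5 * ||h - center||^2."""
    return 0.5 * sum((hi - ci) ** 2 for hi, ci in zip(h, center))

def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def aux_step(h, aux_center, eta):
    """h' = h - eta * grad l_s(h): one step on the auxiliary loss only."""
    g_s = grad_quadratic(h, aux_center)
    return [hi - eta * gi for hi, gi in zip(h, g_s)]

# Assumed constants: both gradient norms at h are at most G = 1.
beta, G, eps = 1.0, 1.0, 0.4
eta = eps / (beta * G ** 2)

h = [0.0, 0.0]
main_center = [1.0, 0.0]  # minimizer of the toy main loss

results = {}
for name, aux_center in [("aligned", [0.5, 0.5]), ("opposed", [-0.5, 0.5])]:
    inner = dot(grad_quadratic(h, main_center), grad_quadratic(h, aux_center))
    h_new = aux_step(h, aux_center, eta)
    results[name] = (
        inner,
        loss_quadratic(h, main_center),      # main loss before the step
        loss_quadratic(h_new, main_center),  # main loss after the step
    )
    print(name, results[name])
```

In the "aligned" case the inner product is above `eps` and the main loss drops after the auxiliary step. The "opposed" case falls outside the lemma's condition and is included only to show that the main loss can rise when the gradients disagree.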

Figures (5)

Figure 1: The pipeline of SSFA. $x$, $u_w$, and $u_s$ denote a batch of labeled data, the weak augmentation of unlabeled data, and the strong augmentation of unlabeled data, respectively; $\{\cdot\}$ denotes the data stream.
Figure 2: Scatter plot of the gradient inner product between the two tasks, and the improvement from SSFA. We transform the x-axis with $\log(x)+1$ for clarity.
Figure 3: Visualization of domain-level features using different methods, where "label" represents the labeled data drawn from the labeled domain, and "unlabel0" to "unlabel9" represent the unlabeled data drawn from ten unlabeled domains respectively.
Figure 4: Visualization of class-level features using different methods.
Figure 5: The impact of $\tau$ for different SSL models on OFFICE-31 ("A/W" task).

Theorems & Definitions (1)

Lemma 1: refer18
