A Review of Pseudo-Labeling for Computer Vision

Patrick Kage; Jay C. Rothenberger; Pavlos Andreadis; Dimitrios I. Diochnos

TL;DR

This survey unifies pseudo-labeling (PL) across semi-supervised, unsupervised, and self-supervised learning in computer vision by introducing a fuzzy-partition view of pseudo-labels. It catalogs a broad taxonomy of PL methods, including sample scheduling, weak supervision, consistency regularization, multi-model approaches, and knowledge distillation, and discusses how these ideas intersect through data filtering, curricula, and augmentation strategies. The paper highlights robustness to label noise and shows how PL concepts permeate both semi-supervised learning (SSL) and unsupervised learning (UL) methods, offering directions such as meta-learning for label assignment and self-supervised regularization to stabilize training. By synthesizing these threads, the work provides a framework for transferring advances across SSL, UL, and distillation, with practical implications for data-efficient learning in domains with scarce labels.

Abstract

Deep neural models have achieved state-of-the-art performance on a wide range of problems in computer science, especially in computer vision. However, deep neural networks often require large datasets of labeled samples to generalize effectively, and an important area of active research is semi-supervised learning, which instead attempts to utilize large quantities of (easily acquired) unlabeled samples. One family of methods in this space is pseudo-labeling, a class of algorithms that use model outputs to assign labels to unlabeled samples, which are then used as labeled samples during training. Such assigned labels, called pseudo-labels, are most commonly associated with the field of semi-supervised learning. In this work we explore a broader interpretation of pseudo-labels within both self-supervised and unsupervised methods. By drawing the connection between these areas we identify new directions where advancements in one area would likely benefit others, such as curriculum learning and self-supervised regularization.
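The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. It assumes a classifier that returns logits for unlabeled samples and keeps only predictions above a confidence threshold. The function names, the threshold of 0.9, and the toy logits are illustrative assumptions and are not taken from the survey.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def assign_pseudo_labels(logits, threshold=0.9):
    """Turn model outputs on unlabeled samples into hard pseudo-labels.

    `threshold` is a hypothetical confidence cutoff: samples whose top
    class probability falls below it are left unlabeled.
    Returns (pseudo_labels, kept_indices).
    """
    probs = softmax(logits)
    confidence = probs.max(axis=1)   # top class probability per sample
    labels = probs.argmax(axis=1)    # predicted class per sample
    keep = confidence >= threshold   # confidence-based filter
    return labels[keep], np.flatnonzero(keep)


# Toy logits standing in for a model's outputs on three unlabeled images.
unlabeled_logits = np.array([
    [4.0, 0.0, 0.0],   # confident in class 0
    [0.2, 0.1, 0.0],   # nearly uniform, so it is filtered out
    [0.0, 0.0, 5.0],   # confident in class 2
])
pseudo_labels, kept_indices = assign_pseudo_labels(unlabeled_logits)
```

The retained samples, paired with their pseudo-labels, would then be mixed into the labeled set for the next round of training. Filtering by confidence is only one possible selection rule.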

Paper Structure (38 sections, 11 equations, 4 figures, 1 table)

Introduction
Pseudo-labels within semi-supervised learning.
Pseudo-labels within unsupervised learning.
Pseudo-labeling techniques are tolerant to label noise.
Outline.
Preliminaries
Fuzzy Partitions
Basic Machine Learning Notation
Deeper Discussion on Labels
Data Augmentation
Semi-Supervised Regimes
Sample Scheduling
Random Selection
Confidence-Based Selection
Curriculum Learning
...and 23 more sections

Figures (4)

Figure 1: Family tree of pseudo-labeling methods. Please see Appendix \ref{appendix:tables} for a table with section links and references. In Table \ref{tab:pl_benchmarks} we give a performance comparison of relevant methods.
Figure 2: A simplified diagram of Multi-Head Co-Training's architecture. Adapted from MHCotraining.
Figure 3: The overall architecture of Meta Pseudo Labels. Adapted from MetaPL.
Figure 4: A chart showing how general pairwise consistency regularization works. Samples are drawn as $\mathbf{x} \sim \mathcal{D}$, and a series of transformations $t\sim\mathcal{T}$ turns each $\mathbf{x}$ into a pair of differently augmented samples $\tilde{\mathbf{x}}_1,\tilde{\mathbf{x}}_2$. This pair is then projected into a latent representation by some projection function $f(\,\cdot\,)$, resulting in latent vectors $\mathbf{z}_1,\mathbf{z}_2$. Typically during consistency regularization, the loss function is structured to minimize the distance between $\mathbf{z}_1$ and $\mathbf{z}_2$, which come from the same sample $\mathbf{x}$, and to maximize their distance from the representations of other samples within a minibatch.
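The Figure 4 caption does not commit to a particular loss, so the following is only a sketch of one common choice: a one-directional InfoNCE-style objective over cosine similarities. The function name, the temperature of 0.5, and the random latent vectors are assumptions made for illustration. The augmentation and projection steps are omitted, and the latent vectors are taken as given.

```python
import numpy as np


def pairwise_consistency_loss(z1, z2, temperature=0.5):
    """InfoNCE-style loss for two views of the same minibatch.

    z1[i] and z2[i] are latent vectors of two augmentations of sample i.
    The loss pulls each matching pair together and pushes z1[i] away from
    z2[j] for j != i. `temperature` is a hypothetical hyperparameter.
    """
    # Normalize so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = (z1 @ z2.T) / temperature            # shape (batch, batch)
    sim = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal (column i).
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
# Two slightly different views of the same samples.
aligned = pairwise_consistency_loss(z, z + 0.01 * rng.normal(size=z.shape))
# Views paired with the wrong samples (rows shifted by one).
mismatched = pairwise_consistency_loss(z, np.roll(z, 1, axis=0))
```

In this toy setup the loss for correctly paired views comes out lower than for mismatched ones, which is the behavior the caption describes. Symmetric variants that also treat `z2` as the anchor are common but are not shown here.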

Theorems & Definitions (8)

Example 1: Illustration of a Fuzzy Set
Definition 1: Fuzzy Partition
Example 2: Illustration of a Fuzzy Partition
Definition 2: Stochastic Labels
Remark 1: Stochastic Labels from Fuzzy Partitions
Definition 3: Pseudo-Labels
Definition 4: Pseudo-Labeling
Definition 5: Pseudo-Examples
