Enhancing Semi-supervised Learning with Zero-shot Pseudolabels

Jichan Chung, Irene Y. Chen

TL;DR

ZeroMatch tackles the high cost of data labeling by combining pseudo-labels derived from a foundation model (FM) with semi-supervised learning (SSL) in a two-stage approach. It first distills knowledge from the foundation model into a compact student using a knowledge-distillation (KD) loss $L_{KD}$, then trains with SSL while keeping an auxiliary KD objective $L_{KD_2}$ to prevent forgetting, yielding the stage-2 objective $L_{KD-SSL} = L_s + L_u + \alpha_t \lambda_p L_{KD_2}$, where $L_s$ and $L_u$ are the supervised and unsupervised SSL losses. The method improves over baselines on six vision and language benchmarks and remains effective across foundation models of varying quality. Because it needs only FM inference rather than full FM fine-tuning, it suits on-device and other low-resource training. The work shows that combining KD and SSL can exploit zero-shot supervision from foundation models to deliver practical, resource-efficient learning without sacrificing performance.
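To make the stage-2 objective concrete, the following is a minimal sketch of how the three terms could be combined at one training step. The function names and the linear ramp used for $\alpha_t$ are assumptions for illustration only; the summary above does not specify how $\alpha_t$ is scheduled or how $\lambda_p$ is chosen.

```python
def kd_ssl_loss(l_s, l_u, l_kd2, alpha_t, lambda_p):
    """Combine the stage-2 terms: L_s + L_u + alpha_t * lambda_p * L_KD2."""
    return l_s + l_u + alpha_t * lambda_p * l_kd2


def linear_ramp(step, total_steps):
    """Hypothetical schedule for alpha_t (an assumption, not taken from the paper).

    Rises linearly from 0 to 1 over total_steps, then stays at 1.
    """
    return min(1.0, step / total_steps)


# Example: scalar loss values at one step, with a fixed weight lambda_p = 0.1.
alpha_t = linear_ramp(step=50, total_steps=100)
total = kd_ssl_loss(l_s=1.0, l_u=0.5, l_kd2=2.0, alpha_t=alpha_t, lambda_p=0.1)
```

In this sketch the auxiliary KD term is the only one that is rescaled, so setting `lambda_p` to zero recovers the plain SSL objective $L_s + L_u$.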

Abstract

The high cost of data labeling presents a major barrier to deploying machine learning systems at scale. Semi-supervised learning (SSL) mitigates this challenge by utilizing unlabeled data alongside limited labeled examples, while the emergence of foundation models (FMs) offers powerful zero-shot capabilities that can further reduce labeling cost. However, directly fine-tuning large FMs is often impractical in resource-constrained settings, and naïvely using their pseudo-labels for unlabeled data can degrade performance because the pseudo-labels may be unreliable or mismatched with the domain of the target task. In this work, we introduce ZeroMatch, a novel SSL framework that integrates knowledge distillation with consistency-based learning to jointly leverage labeled data, unlabeled data, and pseudo-labels from FMs. ZeroMatch enables training compact student models using only FM inference, making it suitable for low-resource environments such as personal devices with limited compute. Experiments on six vision and language classification benchmarks show that ZeroMatch consistently outperforms standard SSL and zero-shot augmented methods, demonstrating its effectiveness and robustness across a range of foundation model qualities.

Paper Structure

This paper contains 67 sections, 8 equations, 1 figure, 18 tables.

Figures (1)

  • Figure 1: Illustration of our two-stage ZeroMatch algorithm. Both labeled and unlabeled input data receive pseudo-labels from foundation models. In stage 1, knowledge distillation is performed with pseudo-labels from the teacher foundation model. In stage 2, the model is trained with the supervised and unsupervised losses of the SSL algorithm, starting from the weights and initial confident predictions learned in the previous stage. We add an auxiliary classifier head (green box) that performs the knowledge distillation task, to reduce the catastrophic forgetting that may occur while running SSL.
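The two-head layout described in the caption can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the class `TwoHeadStudent`, the helper names, the plain linear layers, and the use of hard FM pseudo-labels with cross-entropy are all hypothetical choices. The supervised and unsupervised SSL terms depend on the SSL algorithm in use and are omitted here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of the logits against a hard class label."""
    return -math.log(softmax(logits)[label])


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class TwoHeadStudent:
    """Toy student: a shared backbone feeding a main head and an auxiliary KD head."""

    def __init__(self, backbone, main_head, aux_head):
        self.backbone = backbone
        self.main_head = main_head
        self.aux_head = aux_head

    def forward(self, x):
        feats = linear(self.backbone, x)
        return linear(self.main_head, feats), linear(self.aux_head, feats)


def stage1_kd_loss(model, inputs, fm_labels):
    """Stage 1: fit the main head to FM pseudo-labels on labeled and unlabeled inputs."""
    losses = [cross_entropy(model.forward(x)[0], y) for x, y in zip(inputs, fm_labels)]
    return sum(losses) / len(losses)


def stage2_aux_kd_loss(model, inputs, fm_labels):
    """Stage 2: only the auxiliary head keeps fitting the FM pseudo-labels."""
    losses = [cross_entropy(model.forward(x)[1], y) for x, y in zip(inputs, fm_labels)]
    return sum(losses) / len(losses)


# Example: identity backbone and main head, zero-initialized auxiliary head.
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
student = TwoHeadStudent(identity, identity, zeros)
l_kd = stage1_kd_loss(student, [[1.0, 0.0]], [0])
l_kd2 = stage2_aux_kd_loss(student, [[1.0, 0.0]], [0])
```

Routing the stage-2 distillation signal through a separate head means the FM pseudo-labels shape the shared backbone without directly overriding the main head's predictions, which matches the caption's stated purpose of limiting forgetting during SSL.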