Enhancing Semi-supervised Learning with Zero-shot Pseudolabels
Jichan Chung, Irene Y. Chen
TL;DR
ZeroMatch tackles the high cost of data labeling by unifying foundation-model-derived pseudo-labels with semi-supervised learning through a two-stage approach. It first distills knowledge from a foundation model into a compact student using $L_{KD}$, then trains with SSL while employing an auxiliary KD objective $L_{KD_2}$ to prevent forgetting, yielding $L_{KD-SSL} = L_s + L_u + \alpha_t \lambda_p L_{KD_2}$. The method shows consistent gains across six vision and language benchmarks and remains effective across foundation models of varying quality, enabling on-device or low-resource training by avoiding full FM fine-tuning. The work shows that combining KD and SSL can exploit zero-shot supervision from foundation models to deliver practical, resource-efficient learning without sacrificing performance.
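As a rough illustration of how the second-stage objective $L_{KD-SSL} = L_s + L_u + \alpha_t \lambda_p L_{KD_2}$ combines its terms, the sketch below computes it from scalar loss values. It is not the authors' implementation. The function names are hypothetical, and two points are assumptions not stated in the summary: that $L_{KD_2}$ is a cross-entropy between the student's prediction and a hard FM pseudo-label, and that $\alpha_t$ is a scalar weight that may vary with the training step $t$.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy of `logits` against an integer class index."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def kd_ssl_loss(l_s, l_u, l_kd2, alpha_t, lambda_p):
    """Combine the terms as L_s + L_u + alpha_t * lambda_p * L_KD2.

    l_s:      supervised loss on labeled data
    l_u:      unsupervised (consistency) loss on unlabeled data
    l_kd2:    auxiliary distillation loss against FM pseudo-labels
    alpha_t:  step-dependent weight (assumed to be a scalar here)
    lambda_p: fixed weight on the auxiliary KD term
    """
    return l_s + l_u + alpha_t * lambda_p * l_kd2


# Hypothetical student logits and an FM pseudo-label of class 0.
l_kd2 = cross_entropy([2.0, 0.5, -1.0], target=0)
total = kd_ssl_loss(l_s=0.7, l_u=0.3, l_kd2=l_kd2, alpha_t=0.5, lambda_p=1.0)
```

Setting `alpha_t` to zero reduces the objective to the plain SSL loss $L_s + L_u$, which makes the role of the auxiliary term easy to isolate in an ablation.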
Abstract
The high cost of data labeling presents a major barrier to deploying machine learning systems at scale. Semi-supervised learning (SSL) mitigates this challenge by utilizing unlabeled data alongside limited labeled examples, while the emergence of foundation models (FMs) offers powerful zero-shot capabilities that can further reduce labeling cost. However, directly fine-tuning large FMs is often impractical in resource-constrained settings, and naïvely using their pseudo-labels for unlabeled data can degrade performance because those labels may be unreliable or mismatched with the domain of the target task. In this work, we introduce ZeroMatch, a novel SSL framework that integrates knowledge distillation with consistency-based learning to jointly leverage labeled data, unlabeled data, and pseudo-labels from FMs. ZeroMatch enables training compact student models using only FM inference, making it suitable for low-resource environments such as personal devices with limited compute. Experiments on six vision and language classification benchmarks show that ZeroMatch consistently outperforms standard SSL and zero-shot-augmented methods, demonstrating its effectiveness and robustness across a range of foundation model qualities.
