Meta Co-Training: Two Views are Better than One
Jay C. Rothenberger, Dimitrios I. Diochnos
TL;DR
Meta Co-Training (MCT) addresses the challenge of semi-supervised learning when true multi-view data is unavailable by constructing two complementary views from pre-trained representations and optimizing a bi-level student-teacher objective. The method jointly leverages pseudo-labels across views while using labeled data to supervise and refine the teacher, resulting in robust improvements over traditional co-training, especially when view content differs significantly. Empirical results on ImageNet-10% establish new state-of-the-art performance, with additional gains on Flowers102, Food101, FGVC Aircraft, and iNaturalist datasets, demonstrating MCT’s robustness to view imbalance and its advantage over deep ensembles in many settings. The findings highlight the practical value of using pre-trained foundation-model embeddings as interchangeable views to unlock effective semi-supervised learning without extensive retraining.
Abstract
In many critical computer vision scenarios, unlabeled data is plentiful, but labels are scarce and difficult to obtain. As a result, semi-supervised learning, which leverages unlabeled data to boost the performance of supervised classifiers, has received significant attention in recent literature. One representative class of semi-supervised algorithms is co-training algorithms. Co-training algorithms leverage two different models that have access to different independent and sufficient representations, or "views," of the data to jointly make better predictions. Each of these models creates pseudo-labels on unlabeled points, which are used to improve the other model. We show that in the common case where independent views are not available, we can construct such views inexpensively using pre-trained models. Co-training on the constructed views yields a performance improvement over any of the individual views we construct and performance comparable with recent approaches in semi-supervised learning. We present Meta Co-Training, a novel semi-supervised learning algorithm, which has two advantages over co-training: (i) learning is more robust when there is a large discrepancy between the information content of the different views, and (ii) it does not require retraining from scratch on each iteration. Our method achieves new state-of-the-art performance on ImageNet-10%, achieving a ~4.7% reduction in error rate over prior work. Our method also outperforms prior semi-supervised work on several other fine-grained image classification datasets.
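To make the classic co-training loop described in the abstract concrete, the following is a minimal sketch, not the authors' implementation and not Meta Co-Training itself. It makes several assumptions for illustration: the two views are synthetic 2-D features with independent noise (standing in for embeddings from two pre-trained models), each model is a simple nearest-centroid classifier, and confidence is measured as a distance margin. All function names (`fit_centroids`, `predict`, `co_train`, `make_toy_views`, `accuracy`) are hypothetical.

```python
import random


def fit_centroids(points, labels):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return (label, confidence); confidence is the margin between the
    two smallest squared distances (assumes at least two classes)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)), y) for y, c in centroids.items()
    )
    (d0, y0), (d1, _) = dists[0], dists[1]
    return y0, d1 - d0


def co_train(view_a, view_b, labels, labeled_idx, rounds=5, per_round=10):
    """Toy co-training: each model pseudo-labels its most confident
    unlabeled points, and those labels go to the OTHER model's training set.
    True labels are only read at labeled_idx."""
    views = [view_a, view_b]
    # train[k] maps example index -> (true or pseudo) label for the model on view k.
    train = [{i: labels[i] for i in labeled_idx} for _ in range(2)]

    def fit(k):
        return fit_centroids([views[k][i] for i in train[k]], list(train[k].values()))

    for _ in range(rounds):
        # Classic co-training refits both models from scratch every round.
        models = [fit(0), fit(1)]
        for k in (0, 1):
            other = 1 - k
            candidates = [i for i in range(len(labels)) if i not in train[other]]
            scored = [(predict(models[k], views[k][i]), i) for i in candidates]
            scored.sort(key=lambda s: s[0][1], reverse=True)
            for (y_hat, _), i in scored[:per_round]:
                train[other][i] = y_hat  # pseudo-label handed to the other view
    return fit(0), fit(1)


def make_toy_views(n=200, seed=0):
    """Two synthetic views whose noise is independent given the class."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]

    def view():
        return [[rng.gauss(3.0 if y else -3.0, 1.0) for _ in range(2)] for y in labels]

    return view(), view(), labels


def accuracy(model, view, labels):
    hits = sum(predict(model, x)[0] == y for x, y in zip(view, labels))
    return hits / len(labels)


if __name__ == "__main__":
    view_a, view_b, labels = make_toy_views()
    # Only the first four examples (two per class) count as labeled.
    model_a, model_b = co_train(view_a, view_b, labels, labeled_idx=[0, 1, 2, 3])
    print(accuracy(model_a, view_a, labels), accuracy(model_b, view_b, labels))
```

Note that this sketch refits both models from scratch in every round, which is the cost that the abstract says Meta Co-Training avoids. In practice, the synthetic views would be replaced by features extracted from two different pre-trained models.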
