Domain Adaptation Using Pseudo Labels
Sachin Chhabra, Hemanth Venkateswara, Baoxin Li
TL;DR
This paper tackles unsupervised domain adaptation by addressing the category misalignment that arises from marginal distribution alignment. It introduces DAPL, a simple pipeline that generates target pseudo labels via a Gaussian Mixture-based feature space, then progressively refines and filters them using Confidence, Conformity, and Consistency criteria before using them as supervision to adapt a source classifier. A per-epoch target supervision schedule and a time-varying loss weight enable gradual domain adaptation, yielding competitive results on the Digits, VisDA, and Office-Home benchmarks while avoiding heavy domain-alignment losses. The findings show that high-quality pseudo labels, obtained through principled filtering, can match or exceed the results of more complex domain-alignment techniques, which matters in practice for robust, data-efficient domain adaptation. The approach also highlights the importance of limiting confirmation bias and suggests avenues for future refinement when target data are scarce.
Abstract
In the absence of labeled target data, unsupervised domain adaptation approaches seek to align the marginal distributions of the source and target domains in order to train a classifier for the target. Because these alignment procedures are category-agnostic, they end up misaligning the categories. We address this problem by deploying a pretrained network to determine accurate labels for the target domain using a multi-stage pseudo-label refinement procedure. Each stage filters the pseudo labels based on their confidence, distance (conformity), and consistency. Our results on multiple datasets demonstrate the effectiveness of our simple procedure in comparison with complex state-of-the-art techniques.
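
The three filtering criteria can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `filter_pseudo_labels`, the thresholds `conf_thresh` and `dist_thresh`, the use of Euclidean distance to a per-class mean as the conformity measure, and the use of the previous epoch's labels as the consistency check are all assumptions made for illustration.

```python
import numpy as np


def filter_pseudo_labels(probs, feats, class_means, prev_labels,
                         conf_thresh=0.9, dist_thresh=1.0):
    """Return (labels, keep) for a batch of unlabeled target samples.

    probs:       (n, k) class probabilities from the classifier
    feats:       (n, d) feature vectors of the same samples
    class_means: (k, d) per-class centers in feature space (assumed given)
    prev_labels: (n,)   pseudo labels assigned in the previous epoch
    Thresholds are hypothetical values, not taken from the paper.
    """
    probs = np.asarray(probs, dtype=float)
    feats = np.asarray(feats, dtype=float)
    class_means = np.asarray(class_means, dtype=float)
    prev_labels = np.asarray(prev_labels)

    # Pseudo label = most probable class.
    labels = probs.argmax(axis=1)

    # Confidence: the top probability must be high enough.
    confident = probs.max(axis=1) >= conf_thresh

    # Conformity (assumed form): the feature must lie close to the
    # center of the class it was assigned to.
    dists = np.linalg.norm(feats - class_means[labels], axis=1)
    conforming = dists <= dist_thresh

    # Consistency (assumed form): the label must agree with the one
    # assigned in the previous epoch.
    consistent = labels == prev_labels

    keep = confident & conforming & consistent
    return labels, keep
```

Only samples with `keep` set to `True` would be passed on as supervision for the target domain. Requiring all three criteria at once is one way to limit confirmation bias, since a confident but outlying or unstable prediction is discarded.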
