FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients
Xinchi Qiu, Yan Gao, Lorenzo Sani, Heng Pan, Wanru Zhao, Pedro P. B. Gusmao, Mina Alibeigi, Alex Iacob, Nicholas D. Lane
TL;DR
FedAnchor tackles the challenge of learning from unlabeled edge data in federated settings by introducing a server-side labeled anchor dataset and a novel label contrastive loss applied to a dedicated anchor head. The double-head architecture enables high-quality pseudo-labels for unlabeled clients while mitigating confirmation bias and overfitting. Empirical results on CIFAR-10/100 and SVHN show FedAnchor achieving faster convergence and higher accuracy than state-of-the-art baselines, under both IID and non-IID data distributions. The method offers practical benefits for real-world FL deployments by reducing the reliance on extensive edge labeling and lowering communication overhead through efficient pseudo-labeling via anchor similarities.
Abstract
Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices while keeping data localized. The deployment of FL in numerous real-world applications faces delays, primarily due to the prevalent reliance on supervised tasks. Generating detailed labels at edge devices, if feasible at all, is demanding, given resource constraints and the need for continuous data updates. To address these challenges, solutions such as federated semi-supervised learning (FSSL), which relies on unlabeled client data and a limited amount of labeled data on the server, become pivotal. In this paper, we propose FedAnchor, an FSSL method that introduces a double-head structure: an anchor head paired with the classification head, trained exclusively on labeled anchor data on the server. The anchor head is trained with a newly designed label contrastive loss based on the cosine similarity metric. Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques that rely on high-confidence model predictions. Extensive experiments on the CIFAR-10/100 and SVHN datasets demonstrate that our method outperforms state-of-the-art methods by a significant margin in terms of convergence rate and model accuracy.
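To make the two ideas in the abstract more concrete, the following is a minimal sketch of a cosine-similarity label contrastive loss and of pseudo-labeling by similarity to labeled anchors. The exact loss and labeling rule used by FedAnchor are not given in this summary, so the code substitutes a generic supervised-contrastive-style loss and a "highest mean similarity per class" rule. The function names, the `temperature` parameter, and the plain-list embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def label_contrastive_loss(embeddings, labels, temperature=0.5):
    """Assumed supervised-contrastive-style loss on anchor-head embeddings.

    Same-label pairs are pulled together and different-label pairs pushed
    apart, with cosine similarity as the metric. This is a generic form,
    not necessarily the loss defined in the paper.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # no same-label partner, so sample i contributes nothing
        exp_sims = {
            j: math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
            for j in range(n)
            if j != i
        }
        denom = sum(exp_sims.values())
        total -= sum(math.log(exp_sims[j] / denom) for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


def pseudo_label(embedding, anchor_embeddings, anchor_labels):
    """Assign the class whose anchors are most similar on average (assumed rule)."""
    sims_by_class = {}
    for anchor, label in zip(anchor_embeddings, anchor_labels):
        sims_by_class.setdefault(label, []).append(cosine(embedding, anchor))
    mean_sim = {label: sum(s) / len(s) for label, s in sims_by_class.items()}
    return max(mean_sim, key=mean_sim.get)
```

In this sketch a client would pass the anchor-head embedding of each unlabeled sample to `pseudo_label`, along with the embeddings of the server's labeled anchors, rather than thresholding the classification head's own confidence. Because the label comes from similarity to labeled anchors rather than from the model's own predictions, this illustrates how such a scheme could reduce confirmation bias.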
