FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients

Xinchi Qiu, Yan Gao, Lorenzo Sani, Heng Pan, Wanru Zhao, Pedro P. B. Gusmao, Mina Alibeigi, Alex Iacob, Nicholas D. Lane

TL;DR

FedAnchor tackles the challenge of learning from unlabeled edge data in federated settings by introducing a server-side labeled anchor dataset and a novel label contrastive loss applied to a dedicated anchor head. The double-head architecture enables high-quality pseudo-labels for unlabeled clients while mitigating confirmation bias and overfitting. Empirical results on CIFAR-10/100 and SVHN show FedAnchor achieving faster convergence and higher accuracy than state-of-the-art baselines, under both IID and non-IID data distributions. The method offers practical benefits for real-world FL deployments by reducing the reliance on extensive edge labeling and lowering communication overhead through efficient pseudo-labeling via anchor similarities.

Abstract

Federated learning (FL) is a distributed learning paradigm that facilitates collaborative training of a shared global model across devices while keeping data localized. The deployment of FL in numerous real-world applications faces delays, primarily due to the prevalent reliance on supervised tasks. Generating detailed labels at edge devices, if feasible, is demanding, given resource constraints and the imperative for continuous data updates. In addressing these challenges, solutions such as federated semi-supervised learning (FSSL), which relies on unlabeled clients' data and a limited amount of labeled data on the server, become pivotal. In this paper, we propose FedAnchor, an innovative FSSL method that introduces a unique double-head structure, called anchor head, paired with the classification head trained exclusively on labeled anchor data on the server. The anchor head is empowered with a newly designed label contrastive loss based on the cosine similarity metric. Our approach mitigates the confirmation bias and overfitting issues associated with pseudo-labeling techniques based on high-confidence model prediction samples. Extensive experiments on CIFAR10/100 and SVHN datasets demonstrate that our method outperforms the state-of-the-art method by a significant margin in terms of convergence rate and model accuracy.
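As a rough illustration of how pseudo-labels could be derived from cosine similarity to labeled anchor data, the sketch below assigns an unlabeled sample the label whose anchors are, on average, most similar to it in embedding space. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `cosine_similarity` and `pseudo_label`, the per-label averaging rule, and the toy two-dimensional embeddings are all hypothetical, and in practice the embeddings would come from the anchor head of the model.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pseudo_label(embedding, anchor_embeddings, anchor_labels):
    """Return (label, mean similarity) for the label whose anchors are most similar.

    Assumption: similarities are averaged per label before taking the maximum.
    """
    per_label = defaultdict(list)
    for anchor, label in zip(anchor_embeddings, anchor_labels):
        per_label[label].append(cosine_similarity(embedding, anchor))
    mean_sim = {label: sum(sims) / len(sims) for label, sims in per_label.items()}
    best = max(mean_sim, key=mean_sim.get)
    return best, mean_sim[best]


# Toy anchor set: two anchors per label, embedded in 2-D (hypothetical values).
anchors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]

# An unlabeled sample whose embedding lies close to the label-0 anchors.
label, score = pseudo_label([0.8, 0.2], anchors, labels)
print(label, round(score, 3))
```

Because the decision depends on similarity to anchors rather than on the classification head's own confidence, a scheme of this kind does not feed the classifier's predictions back to itself, which is the intuition behind the reduced confirmation bias claimed in the abstract.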

Paper Structure

This paper contains 27 sections, 9 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: (left) Pipeline of FedAnchor with pseudo labeling and anchor data on the server. (right) Pseudo labeling accuracy with 5000/2500/1000 anchor data on CIFAR10/CIFAR100/SVHN datasets, respectively.
  • Figure 2: The average number of qualified data samples trained by selected clients in each round, for the CIFAR10 Non-IID ($\alpha=0.1$) partition with $250$/$500$ anchor data on the server, using the SemiFL method. The plot shows that the average number of data samples trained by clients is very small relative to the $500$ training data samples each client holds in total.
  • Figure 3: Results on CIFAR10 Non-IID with different anchor sizes. (a) and (b) show the testing accuracy of different methods under the two anchor setups. (c) shows the average number of samples selected for training on each client. (d) shows the pseudo-label accuracy for the 500-anchor case (only this case is plotted to avoid too many lines in one graph; the general trend is the same for other cases): the accuracy of pseudo-labels generated by FedAnchor is consistently above that of the baseline method. (c) and (d) are smoothed every $5$ rounds.
  • Figure 4: Plots on CIFAR100 (IID, 2500 anchors) experiment. (a) testing accuracy. (b) pseudo-label accuracy. FedAnchor can reduce the confirmation bias compared to SemiFL.
  • Figure 5: Visualization of the model's latent space at different stages of FL training with FedAnchor. A t-SNE dimensionality reduction has been performed to improve readability. CIFAR-10 with a Wide ResNet backbone is shown. Different colors refer to different labels. Each data point represents a sample in the (centralized) test set of CIFAR-10. The stars represent the centroids for each label. Panels (a), (b), (c), and (d) show rounds $1$ (after one aggregation), $100$, $200$, and $500$ (final global model), respectively.
  • ...and 4 more figures