Estimating before Debiasing: A Bayesian Approach to Detaching Prior Bias in Federated Semi-Supervised Learning

Guogang Zhu, Xuefeng Liu, Xinghao Wu, Shaojie Tang, Chao Tang, Jianwei Niu, Hao Su

TL;DR

The paper diagnoses a class-prior bias in Federated Semi-Supervised Learning (FSSL) that arises from imbalanced labeled and unlabeled data across heterogeneous clients. It introduces FedDB, a Bayesian debiasing framework that uses the Average Prediction Probability of Unlabeled Data (APP-U) to approximate the biased prior and to guide both local pseudo-labeling (Debiased Pseudo-Labeling, DPL) and global aggregation (Debiased Model Aggregation, DMA). Empirical results on CIFAR10, CIFAR100, and SVHN across IID and Non-IID settings show that FedDB consistently improves over strong baselines, with DPL and DMA contributing complementary gains. The approach offers a practical, plug-in way to reduce prior-induced bias in FSSL, and the released code supports reproducibility.

Abstract

Federated Semi-Supervised Learning (FSSL) leverages both labeled and unlabeled data on clients to collaboratively train a model. In FSSL, heterogeneous data can introduce prediction bias into the model, causing its predictions to skew towards certain classes. Existing FSSL methods primarily tackle this issue by enhancing consistency in model parameters or outputs. However, as the models themselves are biased, merely constraining their consistency is not sufficient to alleviate prediction bias. In this paper, we explore this bias from a Bayesian perspective and demonstrate that it principally originates from label prior bias within the training data. Building upon this insight, we propose a debiasing method for FSSL named FedDB. FedDB utilizes the Average Prediction Probability of Unlabeled Data (APP-U) to approximate the biased prior. During local training, FedDB employs APP-U to refine pseudo-labeling through Bayes' theorem, thereby significantly reducing the label prior bias. Concurrently, during model aggregation, FedDB uses APP-U from participating clients to formulate unbiased aggregate weights, thereby effectively diminishing bias in the global model. Experimental results show that FedDB can surpass existing FSSL methods. The code is available at https://github.com/GuogangZhu/FedDB.
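
To make the idea of estimating a biased prior and then removing it concrete, here is a minimal sketch. It is not the authors' implementation, and the linked repository may differ in every detail. The function names `average_prediction` and `debias`, the toy probabilities, and the choice of a uniform target prior are all assumptions made for illustration. The sketch averages the model's softmax outputs over unlabeled samples to obtain an APP-U-style prior estimate, then divides each prediction by that estimate and renormalizes, which is Bayes' rule under the assumed uniform target prior.

```python
from typing import List


def average_prediction(probs: List[List[float]]) -> List[float]:
    """APP-U-style estimate: mean predicted probability per class
    over a batch of unlabeled samples (illustrative helper)."""
    n = len(probs)
    num_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(num_classes)]


def debias(p: List[float], prior: List[float], eps: float = 1e-8) -> List[float]:
    """Divide a biased posterior by the estimated prior and renormalize.
    Assumption: the target (unbiased) prior is uniform."""
    adjusted = [pi / (qi + eps) for pi, qi in zip(p, prior)]
    total = sum(adjusted)
    return [a / total for a in adjusted]


def argmax(values: List[float]) -> int:
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical softmax outputs from a model skewed towards class 0.
unlabeled_probs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
]

app_u = average_prediction(unlabeled_probs)          # estimated biased prior
debiased = [debias(p, app_u) for p in unlabeled_probs]

raw_labels = [argmax(p) for p in unlabeled_probs]    # pseudo-labels before debiasing
new_labels = [argmax(d) for d in debiased]           # pseudo-labels after debiasing

print("APP-U estimate:", app_u)
print("raw pseudo-labels:     ", raw_labels)
print("debiased pseudo-labels:", new_labels)
```

On this toy batch every raw pseudo-label is class 0, while the debiased pseudo-labels are `[1, 1, 0, 2]`: dividing by the estimated prior lets under-predicted classes win when their probability is high relative to how often the model predicts them.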

Paper Structure

This paper contains 22 sections, 17 equations, 7 figures, 4 tables, 3 algorithms.

Figures (7)

  • Figure 1: Class-wise test accuracy on a balanced test dataset, along with the labeled data distribution on an individual client. (a) Test accuracy of local model, (b) Test accuracy of global model. The class indexes are ranked based on the labeled data distribution.
  • Figure 2: Prior bias in class-imbalanced FSSL.
  • Figure 3: JS divergence between the ground truth bias and either the labeled data distribution or APP-U on clients. (a) Results on the local model, (b) Results on the global model.
  • Figure 4: Framework overview of FedDB.
  • Figure 5: Convergence curve on CIFAR100. (a) IID, (b) Non-IID with $\delta=0.3$.
  • ...and 2 more figures