FedGA: Federated Learning with Gradient Alignment for Error Asymmetry Mitigation

Chenguang Xiao, Zheming Zuo, Shuo Wang

TL;DR

FedGA addresses class-imbalance-induced bias in distributed FL by revealing and mitigating Error Asymmetry (EA), the ratio of Type I to Type II errors. It introduces gradient alignment via label calibration, producing calibrated labels that balance active and inactive gradient contributions so that EA ≈ 1 without extra privacy or communication costs. The method yields more stable convergence and higher accuracy/F1 across diverse datasets and inter-client heterogeneity, outperforming FedAvg and variants, especially under strong imbalance. This approach enhances robustness of decentralized learning in privacy-preserving settings and offers a principled, gradient-based path to bias mitigation.
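To make the EA quantity concrete, here is a minimal sketch of how it could be computed for a binary classifier. It assumes EA is the ratio of raw Type I (false positive) to Type II (false negative) error counts; the function name `error_asymmetry`, the handling of the zero-denominator case, and the toy labels are illustrative assumptions, not taken from the paper.

```python
def error_asymmetry(y_true, y_pred, positive=1):
    """Count Type I / Type II errors and return their ratio.

    Assumption: EA is taken as (false positives) / (false negatives),
    i.e. a ratio of raw error counts for the chosen positive class.
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if fn == 0:
        # No Type II errors: report 1.0 when there are no errors at all,
        # otherwise infinity (both conventions are assumptions of this sketch).
        ea = 1.0 if fp == 0 else float("inf")
    else:
        ea = fp / fn
    return fp, fn, ea


# Hypothetical imbalanced example: 4 positives (minority), 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
fp, fn, ea = error_asymmetry(y_true, y_pred)
print(fp, fn, round(ea, 3))  # prints: 1 3 0.333
```

In this toy case the classifier misses three minority positives but raises only one false alarm, so EA is well below 1; a balanced error profile would give EA close to 1.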

Abstract

Federated learning (FL) is subject to both intra-client and inter-client class imbalance; the latter, more than the former, biases client updates and thus degrades the distributed models. This bias is exacerbated during the server aggregation phase and has yet to be effectively addressed by conventional re-balancing methods. To this end, unlike off-the-shelf label- or loss-based approaches, we propose a gradient alignment (GA)-informed FL method, dubbed FedGA, in which the importance of error asymmetry (EA) in bias is observed and its link to the gradient of the loss with respect to the raw logits is explored. Concretely, GA, implemented by label calibration during model backpropagation, prevents catastrophic forgetting of rare and missing classes, thereby improving model convergence and accuracy. Experimental results on five benchmark datasets demonstrate that GA outperforms the pioneering counterpart FedAvg and its four variants in minimizing EA and update bias, accordingly yielding higher F1 score and accuracy margins when the Dirichlet distribution sampling factor $α$ increases. The code and more details are available at https://anonymous.4open.science/r/FedGA-B052/README.md.
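For readers unfamiliar with Dirichlet-based client partitioning, the following is a minimal sketch of one common way to simulate inter-client class imbalance. It is not the paper's exact protocol, which is not given here: the function name `dirichlet_partition`, the per-class splitting scheme, and the toy label set are assumptions. Under the usual convention for a symmetric Dirichlet distribution, a smaller `alpha` concentrates each class on fewer clients.

```python
import random


def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients, class by class.

    For each class, client proportions are drawn from a symmetric
    Dirichlet(alpha) distribution, sampled here as normalized
    Gamma(alpha, 1) draws using only the standard library.
    """
    rng = random.Random(seed)
    clients = [[] for _ in range(n_clients)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(n_clients)]
        total = sum(draws)
        if total == 0.0:
            # Numerical underflow for tiny alpha: give the class to one client.
            draws = [0.0] * n_clients
            draws[rng.randrange(n_clients)] = 1.0
            total = 1.0
        start, cum = 0, 0.0
        for k in range(n_clients):
            cum += draws[k] / total
            # The last client takes the remainder so no index is dropped.
            end = len(idx) if k == n_clients - 1 else min(round(cum * len(idx)), len(idx))
            end = max(end, start)
            clients[k].extend(idx[start:end])
            start = end
    return clients


# Hypothetical toy data: 1000 samples over 10 balanced classes.
labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
for k, part in enumerate(parts):
    counts = [sum(1 for i in part if labels[i] == c) for c in range(10)]
    print(f"client {k}: {counts}")
```

Printing the per-client class counts makes rare and missing classes visible directly; rerunning with a larger `alpha` (for example 10) should produce visibly more uniform counts.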

Paper Structure

This paper contains 7 sections, 10 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: We propose an FL framework for mitigating the error asymmetry caused by class imbalance. Our gradient alignment (GA) method, unlike the binary cross entropy (BCE) adopted in the widely used FedAvg, scales active and inactive gradients through label calibration to guide model backpropagation, thus reducing client update bias without raising additional privacy or communication concerns.
  • Figure 2: Class accuracy before and after local training with classical FedAvg on inter-client imbalanced MNIST. Each subplot corresponds to a random active client, showing sharp accuracy drops for rare and missing classes post-training.
  • Figure 3: Mean ratio of Type I to Type II errors on an imbalanced binary subset (e.g. classes '6' and '0') of the MNIST dataset, with the imbalance ratio $r$ set to 10 and 100. PP and PN represent predicted positive and negative, respectively. LA denotes layerwise attention. Notably, FedGA (green) is equivalent to FedAvg (blue) when $r$ equals 1. Best viewed in color and zoomed in.
  • Figure 4: Comparative accuracy of FedAvg and FedGA on CIFAR-10 with four inter-client imbalance levels. FedGA converges much faster than FedAvg at each $\alpha$.
  • Figure 5: The average EA ratio of ten active clients on inter-client imbalanced CIFAR-10 during the training phase.
  • ...and 2 more figures
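
The "active" and "inactive" gradients mentioned in the Figure 1 caption can be illustrated with the standard gradient of BCE with respect to the raw logits, which is `sigmoid(z) - y` per class. The sketch below shows only how changing the target labels rescales these gradients; the soft labels used are hypothetical values chosen for illustration and are not the paper's calibration rule, which is not stated in this summary.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_logit_grads(logits, targets):
    """d(BCE)/d(logit_j) = sigmoid(logit_j) - target_j for each class j."""
    return [sigmoid(z) - y for z, y in zip(logits, targets)]


logits = [2.0, -1.0, 0.5]
one_hot = [1.0, 0.0, 0.0]  # class 0 active, classes 1 and 2 inactive
hard = bce_logit_grads(logits, one_hot)

# Hypothetical calibrated labels, chosen only for illustration.
soft = [0.9, 0.2, 0.2]
calibrated = bce_logit_grads(logits, soft)

print("active gradient:   ", hard[0], "->", calibrated[0])
print("inactive gradients:", hard[1:], "->", calibrated[1:])
```

With one-hot labels the active gradient is negative (raising the true-class logit) and the inactive gradients are positive (lowering the other logits). Moving the targets away from 0 and 1 shrinks both kinds of gradient, which is the general mechanism by which a label adjustment can rebalance their contributions.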

Theorems & Definitions (1)

  • Definition 1