Exploring Vacant Classes in Label-Skewed Federated Learning

Kuangpu Guo, Yuhe Ding, Jian Liang, Ran He, Zilei Wang, Tieniu Tan

TL;DR

This work tackles label-skewed federated learning, where two problems persist after class-balanced local training: vacant classes (classes absent from a client's local data) remain poorly recognized, and minority classes are still misclassified as majority classes. It introduces FedVLS, which combines vacant-class distillation from the global model with logit suppression that penalizes non-label logits. The authors report robust convergence and higher accuracy than state-of-the-art baselines across multiple datasets and degrees of skew, with ablation studies supporting the contribution of each component. The method is also described as complementary to domain-shift methods.

Abstract

Label skews, characterized by disparities in local label distribution across clients, pose a significant challenge in federated learning. As minority classes suffer from worse accuracy due to overfitting on local imbalanced data, prior methods often incorporate class-balanced learning techniques during local training. Although these methods improve the mean accuracy across all classes, we observe that vacant classes, i.e., categories absent from a client's data distribution, remain poorly recognized. In addition, local models still lag behind the global model in accuracy on minority classes. This paper introduces FedVLS, a novel approach to label-skewed federated learning that integrates vacant-class distillation and logit suppression. Specifically, vacant-class distillation applies knowledge distillation during local training on each client to retain the global model's information about vacant classes. Logit suppression directly penalizes network logits for non-label classes, addressing the misclassification of minority-class samples as majority classes. Extensive experiments validate the efficacy of FedVLS, demonstrating superior performance compared to previous state-of-the-art (SOTA) methods across diverse datasets with varying degrees of label skew. Our code is available at https://github.com/krumpguo/FedVLS.
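
The abstract names two regularizers but gives no formulas, so the following is only a minimal sketch of how such a local objective could be assembled for a single sample. Everything in it is an assumption made for illustration: the function names, the weights `lam` and `mu`, the cross-entropy form of the distillation term, and the log-sum-exp form of the suppression term. It should not be read as the loss used in FedVLS; the authors' repository is the reference for that.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_norm for z in logits]


def cross_entropy(local_logits, label):
    """Standard cross-entropy on the ground-truth label."""
    return -log_softmax(local_logits)[label]


def vacant_distillation(local_logits, global_logits, vacant_classes):
    """Assumed form: match the global model's probabilities on vacant classes only."""
    log_p_local = log_softmax(local_logits)
    p_global = [math.exp(v) for v in log_softmax(global_logits)]
    return -sum(p_global[c] * log_p_local[c] for c in vacant_classes)


def logit_suppression(local_logits, label):
    """Assumed form: log-sum-exp of the non-label logits (lower is better)."""
    others = [z for c, z in enumerate(local_logits) if c != label]
    m = max(others)
    return m + math.log(sum(math.exp(z - m) for z in others))


def local_loss(local_logits, global_logits, label, vacant_classes, lam=1.0, mu=1.0):
    """Hypothetical combined objective; lam and mu are illustrative weights."""
    return (cross_entropy(local_logits, label)
            + lam * vacant_distillation(local_logits, global_logits, vacant_classes)
            + mu * logit_suppression(local_logits, label))


# Toy example: 4 classes, the client has no samples of classes 2 and 3.
global_logits = [2.0, 1.0, 0.5, 1.5]
local_logits = [3.0, 0.2, -1.0, -2.0]
loss = local_loss(local_logits, global_logits, label=0, vacant_classes=[2, 3])
```

In this sketch the distillation term grows when the local model assigns less probability to the vacant classes than the global model does, and the suppression term grows when any non-label logit is large, which matches the two failure modes the abstract describes.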

Paper Structure (31 sections, 11 equations, 9 figures, 17 tables, 1 algorithm)

Figures (9)

  • Figure 1: Class-wise accuracy of the initial global model and updated local models on IID and label-skewed CIFAR10 data distributions. (a) shows the result of updating on IID local data with FedAvg (McMahan et al., 2017). (b-d) show the results of updating on skewed data distributions with FedAvg (McMahan et al., 2017), FedLC (Zhang et al., 2022), and our FedVLS, respectively. The value (%) in each caption corresponds to the accuracy of the global model aggregated from local models.
  • Figure 2: Confusion matrix of client 3 on the CIFAR10 dataset with Dirichlet-based label skews ($\beta = 0.5$) using FedLC (Zhang et al., 2022).
  • Figure 3: The test accuracy over each communication round during training for different levels of Dirichlet-based label skews ($\beta \in \{0.1, 0.05\}$) on CIFAR10 and CIFAR100 datasets.
  • Figure 4: Sensitivity analysis on the client participating rates $\mathbf{R}$, local epochs $\mathbf{E}$, and client numbers $\mathbf{N}$.
  • Figure 5: Class-wise accuracy of the initial global model and updated local model on IID and label-skewed CIFAR10 data distributions. (a) shows the result of updating on IID local data with FedAvg (McMahan et al., 2017). (b-d) show the results of updating on skewed local data distributions with FedAvg, FedLC (Zhang et al., 2022), and our FedVLS, respectively. The value (%) in each caption corresponds to the accuracy of the global model aggregated from updated local models.
  • ...and 4 more figures
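
The captions above refer to Dirichlet-based label skews controlled by $\beta$, but this page does not spell out the partitioning procedure. A common way to simulate such skews, sketched here under that assumption, is to split each class's samples across clients according to proportions drawn from a symmetric Dirichlet distribution with concentration $\beta$; smaller $\beta$ gives more skewed clients and more vacant classes. The function names and the toy label list are hypothetical.

```python
import random


def dirichlet_label_partition(labels, num_clients, beta, seed=0):
    """Split sample indices across clients, class by class, using Dir(beta) proportions."""
    rng = random.Random(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        # A Dirichlet sample is a vector of Gamma(beta, 1) draws, normalized to sum to 1.
        gammas = [rng.gammavariate(beta, 1.0) for _ in range(num_clients)]
        total = sum(gammas) or 1.0  # guard against numerical underflow
        start, cum = 0, 0.0
        for k in range(num_clients):
            cum += gammas[k] / total
            # The last client takes the remainder so every sample is assigned.
            end = len(idx) if k == num_clients - 1 else min(len(idx), round(cum * len(idx)))
            client_indices[k].extend(idx[start:end])
            start = end
    return client_indices


def vacant_classes(labels, indices):
    """Classes present in the full dataset but absent from one client's samples."""
    return sorted(set(labels) - {labels[i] for i in indices})


# Toy dataset: 1000 samples over 10 balanced classes, split across 5 clients.
toy_labels = [i % 10 for i in range(1000)]
parts = dirichlet_label_partition(toy_labels, num_clients=5, beta=0.5)
vacant_per_client = [vacant_classes(toy_labels, p) for p in parts]
```

Each entry of `vacant_per_client` lists the classes a client never sees during local training, which is the set a vacant-class regularizer would target.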