Personalized Federated Learning on Heterogeneous and Long-Tailed Data via Expert Collaborative Learning
Fengling Lv, Xinyi Shang, Yang Zhou, Yiqun Zhang, Mengke Li, Yang Lu
TL;DR
The paper tackles the challenge of personalized federated learning under joint data heterogeneity and global long-tailed distributions. It introduces Expert Collaborative Learning (ECL), which assigns multiple experts to each client, each trained on a distinct subset of classes, while reusing the global backbone to maintain robust feature representation. In Phase I, a global model is trained via FedAvg; in Phase II, the global classifier is balanced with $L_{BSCE}$, experts are trained on their subsets, and logits are aggregated through a weighted combination with the global logits using a mixing parameter $\lambda$. Across CIFAR-LT, FashionMNIST-LT, and tiny-ImageNet200-LT, ECL consistently outperforms state-of-the-art PFL approaches, especially on tail classes, demonstrating robustness to data heterogeneity and unknown global distributions while preserving privacy.
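The Phase II ingredients named above can be illustrated with a minimal sketch. It rests on several assumptions that this summary does not confirm: $L_{BSCE}$ is read as a balanced softmax cross-entropy (logits shifted by the log of per-class sample counts), expert logits are averaged per class over the experts that cover that class, and the result is mixed with the global logits as a convex combination weighted by $\lambda$. The function names `balanced_softmax_ce` and `combine_logits` are hypothetical, and the code uses plain Python lists rather than any deep learning framework.

```python
import math


def balanced_softmax_ce(logits, label, class_counts):
    """Cross-entropy on logits shifted by log class counts.

    Assumed reading of L_BSCE; the summary does not spell out the formula.
    """
    adjusted = [z + math.log(n) for z, n in zip(logits, class_counts)]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(adjusted)
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]


def combine_logits(global_logits, expert_outputs, lam):
    """Mix global logits with per-class averaged expert logits.

    expert_outputs: list of (class_ids, logits) pairs, one per expert,
    where class_ids lists the classes that expert was trained on.
    Assumed rule: lam * global + (1 - lam) * expert average; classes
    covered by no expert get an expert contribution of 0.0.
    """
    num_classes = len(global_logits)
    sums = [0.0] * num_classes
    hits = [0] * num_classes
    for class_ids, logits in expert_outputs:
        for c, z in zip(class_ids, logits):
            sums[c] += z
            hits[c] += 1
    expert_avg = [s / h if h else 0.0 for s, h in zip(sums, hits)]
    return [lam * g + (1.0 - lam) * e for g, e in zip(global_logits, expert_avg)]


# Example: three classes, two experts with overlapping class subsets.
mixed = combine_logits(
    [1.0, 2.0, 3.0],
    [([0, 1], [1.0, 0.0]), ([1, 2], [2.0, 5.0])],
    lam=0.5,
)
```

Under this reading, setting `lam` to 1.0 recovers the global model's logits unchanged, while smaller values give the class-subset experts more influence on the final prediction.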
Abstract
Personalized Federated Learning (PFL) aims to learn a customized model for each client without disclosing raw data by leveraging the collective knowledge of distributed clients. However, data collected in real-world scenarios is likely to follow a long-tailed distribution. For example, in the medical domain, general health notes typically far outnumber notes specifically related to certain diseases. The presence of long-tailed data can significantly degrade the performance of PFL models. Additionally, because clients operate in diverse environments, data heterogeneity is also a classic challenge in federated learning. In this paper, we explore the joint problem of global long-tailed distribution and data heterogeneity in PFL and propose a method called Expert Collaborative Learning (ECL) to tackle it. Specifically, each client has multiple experts, and each expert has a different training subset, which ensures that each class, especially the minority classes, receives sufficient training. The experts then collaborate to produce the final prediction. Without bells and whistles, vanilla ECL outperforms other state-of-the-art PFL methods on several benchmark datasets under different degrees of data heterogeneity and long-tailed distribution.
