Personalized federated learning based on feature fusion
Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li
TL;DR
This work tackles federated learning under label distribution skew, data/model heterogeneity, and communication constraints by introducing pFedPM, a feature-fusion-based personalized FL method. Clients share class-wise feature representations rather than gradients, using a shared body and two heads, with a hyperparameter $a$ to balance local versus global features and a relation network to compute label correlations. A convergence analysis under standard non-convex assumptions supports the method's theoretical soundness, while experiments on MNIST, FEMNIST, and CIFAR10 show improved accuracy and significantly reduced communication compared with several baselines. The approach offers a practical, privacy-preserving means to personalize FL in heterogeneous environments and can adapt to missing or imbalanced labels while lowering communication cost.
Abstract
Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy. However, due to the heterogeneity of data, models, and devices, the final global model may not perform well on each client's task. Communication bottlenecks, data heterogeneity, and model heterogeneity are common challenges in federated learning. In this work, we consider the label distribution skew problem, a type of data heterogeneity that is easily overlooked. In the context of classification, we propose a personalized federated learning approach called pFedPM. In our process, we replace traditional gradient uploading with feature uploading, which helps reduce communication costs and allows for heterogeneous client models. These feature representations also help preserve privacy to some extent. We use a hyperparameter $a$ to mix local and global features, which enables us to control the degree of personalization. We also introduce a relation network as an additional decision layer, which provides a non-linear learnable classifier to predict labels. Experimental results show that, with an appropriate setting of $a$, our scheme outperforms several recent FL methods on the MNIST, FEMNIST, and CIFAR10 datasets while requiring less communication.
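To make the feature-mixing idea concrete, the following is a minimal sketch, not the pFedPM implementation. It assumes that each client uploads one mean feature vector per class, that the server averages these vectors across the clients holding that class, and that a client fuses features with the convex combination $a \cdot \text{local} + (1 - a) \cdot \text{global}$. The exact fusion rule, the direction in which $a$ weights the two terms, and the function names `class_means`, `aggregate`, and `mix` are all assumptions made for illustration.

```python
from collections import defaultdict


def class_means(features, labels):
    """Client side: average the feature vectors of each class (assumed upload format)."""
    sums, counts = {}, defaultdict(int)
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def aggregate(client_means):
    """Server side: unweighted average per class over the clients that hold that class."""
    sums, counts = {}, defaultdict(int)
    for means in client_means:
        for y, vec in means.items():
            if y not in sums:
                sums[y] = [0.0] * len(vec)
            sums[y] = [s + v for s, v in zip(sums[y], vec)]
            counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def mix(local, global_, a):
    """Assumed fusion rule: a * local + (1 - a) * global for each class.

    Classes missing on the client fall back to the global feature.
    """
    fused = {}
    for y, g in global_.items():
        if y in local:
            fused[y] = [a * lv + (1 - a) * gv for lv, gv in zip(local[y], g)]
        else:
            fused[y] = list(g)
    return fused


# Two toy clients with skewed labels: client 1 has no samples of class 1.
c1 = class_means([[1.0, 0.0], [3.0, 0.0]], [0, 0])
c2 = class_means([[0.0, 2.0], [4.0, 4.0]], [0, 1])
g = aggregate([c1, c2])
fused_c1 = mix(c1, g, a=0.5)
```

Under this assumed rule, $a = 1$ keeps only the local features (full personalization) and $a = 0$ keeps only the global ones. The per-round upload in this sketch is one vector per class, independent of the model size, which illustrates why sharing features can cost less communication than sharing gradients.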
