Personalized federated learning based on feature fusion

Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li

TL;DR

This work tackles federated learning under label distribution skew, data/model heterogeneity, and communication constraints by introducing pFedPM, a feature-fusion-based personalized FL method. Clients share class-wise feature representations rather than gradients, using a shared body and two heads, with a hyperparameter $a$ to balance local versus global features and a relation network to compute label correlations. A convergence analysis under standard non-convex assumptions supports the method's theoretical soundness, while experiments on MNIST, FEMNIST, and CIFAR10 show improved accuracy and significantly reduced communication compared with several baselines. The approach offers a practical, privacy-preserving means to personalize FL in heterogeneous environments and can adapt to missing or imbalanced labels while lowering communication cost.

Abstract

Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy. However, due to the heterogeneity of data, models, and devices, the final global model may not perform well on each client's tasks. Communication bottlenecks, data heterogeneity, and model heterogeneity are common challenges in federated learning. In this work, we consider the label distribution skew problem, a type of data heterogeneity that is easily overlooked. In the context of classification, we propose a personalized federated learning approach called pFedPM. In our process, we replace traditional gradient uploading with feature uploading, which helps reduce communication costs and allows for heterogeneous client models. These feature representations also help preserve privacy to some extent. We use a hyperparameter $a$ to mix local and global features, which enables us to control the degree of personalization. We also introduce a relation network as an additional decision layer, which provides a non-linear learnable classifier to predict labels. Experimental results show that, with an appropriate setting of $a$, our scheme outperforms several recent FL methods on the MNIST, FEMNIST, and CIFAR10 datasets and requires less communication.
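The role of $a$ can be illustrated with a minimal sketch. The abstract does not state the mixing rule, so the code below assumes a per-class convex combination in which $a$ weights the local feature and $1 - a$ weights the global one. The function name `mix_features`, the dictionary layout, and the fallback to the global feature for labels a client does not hold are all hypothetical choices made for illustration, not the paper's implementation.

```python
from typing import Dict, List

Vector = List[float]


def mix_features(local: Dict[int, Vector],
                 global_: Dict[int, Vector],
                 a: float) -> Dict[int, Vector]:
    """Blend class-wise local and global features.

    Assumption: mixed = a * local + (1 - a) * global, per class.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    mixed: Dict[int, Vector] = {}
    for label, g in global_.items():
        l = local.get(label)
        if l is None:
            # Assumption: a label missing on this client falls back
            # to the global feature for that label.
            mixed[label] = list(g)
        else:
            mixed[label] = [a * li + (1.0 - a) * gi for li, gi in zip(l, g)]
    return mixed


# A client that only holds labels 4 and 5, with toy 2-d features.
local_feats = {4: [1.0, 0.0], 5: [0.0, 1.0]}
global_feats = {4: [0.0, 0.0], 5: [1.0, 1.0], 9: [2.0, 2.0]}
mixed = mix_features(local_feats, global_feats, a=0.5)
```

Under this assumed rule, $a = 1$ keeps only the client's own features (full personalization) and $a = 0$ uses only the global features.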

Paper Structure (12 sections, 2 theorems, 22 equations, 3 figures, 2 tables, 1 algorithm)


Key Result

Theorem 1

(One-round deviation) Let Assumptions 1 to 4 hold. For an arbitrary client, after every communication round, we have,

Figures (3)

  • Figure 1: An example of the heterogeneous setting in pFedPM, in which the $i$-th client only holds data with labels 4 and 5. First, the client updates its first-layer model (feature extraction module and decision module) by minimizing the classification loss $\ell_S$ and the distance loss $\ell_R$ between mixed features and local features. Next, the feature extraction module is fixed, and the relation module is updated, using concatenated mixed features, to minimize the loss function $\ell_{MSE}$.
  • Figure 2: We input an image of the handwritten digit nine into the feature extraction module to obtain the features of the image. Since this is a ten-class classification problem, we first make ten copies of these features. Then, we concatenate each copy with the mixed features corresponding to a different label. These concatenated features are input into the relation module to obtain correlation scores. Finally, we apply a softmax function to the scores to obtain the predicted label.
  • Figure 3: Accuracy on the FEMNIST dataset for different values of the hyperparameter $a$.
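
The inference procedure described for Figure 2 can be sketched roughly as follows. In the paper the relation module is a learned non-linear network; here `toy_relation_module` is a hypothetical stand-in that scores a concatenated pair by negative squared distance, used only so the example runs without training. The function names and the 2-d toy features are likewise assumptions.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def toy_relation_module(pair: Vector) -> float:
    """Hypothetical stand-in for the learned relation module.

    Splits the concatenated vector into its two halves and returns
    the negative squared distance between them as a correlation score.
    """
    half = len(pair) // 2
    x, mixed = pair[:half], pair[half:]
    return -sum((xi - mi) ** 2 for xi, mi in zip(x, mixed))


def predict(feature: Vector,
            mixed_feats: Dict[int, Vector],
            relation: Callable[[Vector], float]) -> Tuple[int, Dict[int, float]]:
    labels = sorted(mixed_feats)
    # One copy of the image feature per class.
    copies = [list(feature) for _ in labels]
    # Concatenate each copy with that class's mixed feature.
    pairs = [c + mixed_feats[lab] for c, lab in zip(copies, labels)]
    # Score every pair, then normalize the scores with softmax.
    probs = softmax([relation(p) for p in pairs])
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], dict(zip(labels, probs))


# Ten classes with toy 2-d mixed features; the input lies nearest class 9.
mixed_feats = {label: [float(label), 0.0] for label in range(10)}
label, probs = predict([8.7, 0.1], mixed_feats, toy_relation_module)
```

Swapping `toy_relation_module` for a trained network would leave the copy, concatenate, score, and softmax steps unchanged.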

Theorems & Definitions (2)

  • Theorem 1
  • Corollary 1