FedAPA: Server-side Gradient-Based Adaptive Personalized Aggregation for Federated Learning on Heterogeneous Data

Yuxia Sun, Aoxiang Sun, Siyi Pan, Zhixiao Fu, Jingcai Guo

TL;DR

FedAPA addresses personalization under data heterogeneity with a server-side, gradient-based adaptive aggregation that learns, for each client, the weights used to combine collaborators’ parameters. Client $i$ receives personalized parameters $\bar{\theta}_i = \sum_j a_{i,j} \theta_j$, and the server updates that client's aggregation weights $A_i$ from the client-parameter delta $\Delta \theta_i$, i.e., $A_i \leftarrow A_i - \eta (\nabla_{A_i} \bar{\theta}_i)^T \Delta \theta_i$, followed by post-processing to stabilize training. The authors provide convergence guarantees under standard smoothness and unbiased-gradient assumptions, and experiments on FMNIST, CIFAR-10, and CIFAR-100 show that FedAPA achieves the highest accuracy among the compared methods with competitive computation and low communication overhead, particularly in practical non-IID settings. Because the adaptive weights are learned centrally without auxiliary networks, the approach offers a scalable, communication-efficient path to personalized federated learning.
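Because $\bar{\theta}_i$ is linear in the weights, the $j$-th entry of $(\nabla_{A_i} \bar{\theta}_i)^T \Delta \theta_i$ reduces to the dot product $\theta_j^T \Delta \theta_i$. The following is a minimal Python sketch of the aggregation and one weight update for a single client, not the authors' implementation. The function names, the toy parameter vectors, the learning rate, and the value of `delta` are all hypothetical, and the sign convention of $\Delta \theta_i$ is not stated in this summary, so it is passed in as given.

```python
def aggregate(weights, client_params):
    """Personalized parameters: theta_bar_i = sum_j a_ij * theta_j."""
    dim = len(client_params[0])
    return [
        sum(a * theta[k] for a, theta in zip(weights, client_params))
        for k in range(dim)
    ]


def update_weights(weights, client_params, delta, lr):
    """One gradient step: A_i <- A_i - lr * (grad_{A_i} theta_bar_i)^T delta.

    theta_bar_i is linear in A_i, so entry j of the product is the
    dot product between theta_j and delta.
    """
    return [
        a - lr * sum(t * d for t, d in zip(theta, delta))
        for a, theta in zip(weights, client_params)
    ]


# Toy setup (hypothetical values): 3 clients, 2-dimensional parameters.
client_params = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A_0 = [0.5, 0.25, 0.25]            # aggregation weights for client 0

theta_bar_0 = aggregate(A_0, client_params)

# Hypothetical client-parameter delta for client 0.
delta_0 = [0.25, -0.5]
A_0_new = update_weights(A_0, client_params, delta_0, lr=0.5)
```

In this toy run, collaborators whose parameters have a negative dot product with the delta gain weight, and those with a positive dot product lose weight. The updated weights are not yet clipped or normalized; that is the role of the post-processing step.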

Abstract

Personalized federated learning (PFL) tailors models to clients' unique data distributions while preserving privacy. However, existing aggregation-weight-based PFL methods often struggle with heterogeneous data, facing challenges in accuracy, computational efficiency, and communication overhead. We propose FedAPA, a novel PFL method featuring a server-side, gradient-based adaptive aggregation strategy to generate personalized models, by updating aggregation weights based on gradients of client-parameter changes with respect to the aggregation weights in a centralized manner. FedAPA guarantees theoretical convergence and achieves superior accuracy and computational efficiency compared to 10 PFL competitors across three datasets, with competitive communication overhead.

Paper Structure

This paper contains 21 sections, 29 equations, 2 figures, 4 tables, 1 algorithm.

Figures (2)

  • Figure 1: Framework of FedAPA. ① The server generates the personalized model $\bar{\theta}_i$ via aggregation according to the weight vector $A_i$; ② Each client downloads its model $\bar{\theta}_i$; ③ Each client trains locally on its private data; ④ Each client uploads its model $\theta_i$; ⑤ The server computes the update of the model parameters $\Delta\theta_i$; ⑥ The server updates $A_i$ via gradient descent according to $\Delta\theta_i$; ⑦ The server post-processes $A_i$ via clipping, self-weight setting, and normalization.
  • Figure 2: Convergence curves of FedAPA and 11 compared methods on CIFAR-10 (20 clients, practical Non-IID setting).
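
Step ⑦ in Figure 1 names three post-processing operations but this summary does not give their exact form. The Python sketch below shows one plausible reading, offered as an illustration only: negative weights are clipped to zero, the client's own weight is set to a fixed value, and the remaining weights are rescaled so the whole vector sums to 1. The function name `post_process`, the `self_weight` parameter and its default of 0.5, and the uniform fallback are all assumptions, not details from the paper.

```python
def post_process(weights, self_index, self_weight=0.5):
    """Illustrative post-processing of one aggregation weight vector A_i.

    All three rules below are assumptions about what "clipping,
    self-weight setting, and normalization" could mean.
    """
    # Assumed clipping rule: negative weights become zero.
    clipped = [max(a, 0.0) for a in weights]

    # Total weight currently assigned to the other clients.
    others = sum(a for j, a in enumerate(clipped) if j != self_index)

    result = []
    for j, a in enumerate(clipped):
        if j == self_index:
            # Assumed self-weight rule: fixed share for the client itself.
            result.append(self_weight)
        elif others > 0:
            # Assumed normalization: the others share 1 - self_weight.
            result.append((1.0 - self_weight) * a / others)
        else:
            # Fallback when every other weight was clipped: spread evenly.
            result.append((1.0 - self_weight) / (len(weights) - 1))
    return result


# Hypothetical raw weights for client 0 after a gradient step.
raw = [0.375, -0.25, 0.125]
processed = post_process(raw, self_index=0)
```

Under these assumptions the processed vector is a valid convex combination, so the aggregated model $\bar{\theta}_i$ stays within the span of the uploaded client models.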