FedAGHN: Personalized Federated Learning with Attentive Graph HyperNetworks

Jiarui Song, Yunheng Shen, Chengbin Hou, Pengyu Wang, Jinbao Wang, Ke Tang, Hairong Lv

TL;DR

FedAGHN tackles statistical heterogeneity in Federated Learning by learning client-specific, layer-wise collaboration graphs via Attentive Graph HyperNetworks to generate personalized initial models for each client. It introduces two trainable scalars per layer to adapt collaboration patterns and employs cosine-based priors on previous updates to compute attention weights, enabling end-to-end optimization of personalized aggregation. Empirical results across CIFAR-10/100 and Tiny-ImageNet under various non-IID settings demonstrate state-of-the-art performance, with analyses and visualizations confirming the learned graphs reflect data distribution similarities and dynamic collaboration across rounds. The approach yields a lightweight, scalable server-side mechanism for fine-grained personalization with practical implications for real-world FL deployments.

Abstract

Personalized Federated Learning (PFL) aims to address the statistical heterogeneity of data across clients by learning a personalized model for each client. Among PFL approaches, personalized aggregation-based methods aggregate parameters on the server side to generate personalized models, and focus on learning appropriate collaborative relationships among clients for that aggregation. However, these collaborative relationships vary across scenarios and even across stages of the FL process. To this end, we propose Personalized Federated Learning with Attentive Graph HyperNetworks (FedAGHN), which employs Attentive Graph HyperNetworks (AGHNs) to dynamically capture fine-grained collaborative relationships and generate client-specific personalized initial models. Specifically, AGHNs use graphs to explicitly model the client-specific collaborative relationships, construct collaboration graphs, and introduce a tunable attentive mechanism to derive the collaboration weights, so that the personalized initial models can be obtained by aggregating parameters over the collaboration graphs. Extensive experiments demonstrate the superiority of FedAGHN. Moreover, a series of visualizations is presented to explore the effectiveness of the collaboration graphs learned by FedAGHN.
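The aggregation idea can be illustrated with a minimal sketch for a single layer. It is not the authors' implementation: the exact formula is not given here, so the scoring rule below (cosine similarity between clients' previous updates, multiplied by a hypothetical scalar `scale`, plus a hypothetical `self_bias` added to the client's own entry, followed by a softmax) is an assumption. The two scalars merely stand in for the two trainable per-layer scalars mentioned in the TL;DR, and all function and parameter names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_weights(i, deltas, scale, self_bias):
    """Collaboration weights of client i over all clients for one layer.

    deltas    : per-client update vectors of this layer from the previous round
    scale     : hypothetical scalar sharpening or flattening the cosine prior
    self_bias : hypothetical scalar added to client i's own score
    """
    scores = []
    for j, delta_j in enumerate(deltas):
        score = scale * cosine(deltas[i], delta_j)
        if j == i:
            score += self_bias
        scores.append(score)
    return softmax(scores)


def aggregate_layer(weights, params):
    """Weighted average of per-client parameter vectors for one layer."""
    dim = len(params[0])
    return [sum(w * p[k] for w, p in zip(weights, params)) for k in range(dim)]
```

Under these assumptions, `scale = 0` and `self_bias = 0` give uniform weights, so the sketch reduces to plain averaging, while a larger `scale` concentrates weight on clients whose updates point in a similar direction. In the paper the scalars are trained rather than fixed; that optimization is omitted here.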

Paper Structure

This paper contains 32 sections, 10 equations, 8 figures, 10 tables, 2 algorithms.

Figures (8)

  • Figure 1: Comparison of previous works and our approach in modeling collaborative relationships. A darker node color indicates a higher collaborative weight of the corresponding edge. Our FedAGHN can adaptively tune the attentive weights among clients for layer-wise personalized aggregation.
  • Figure 2: The overall framework of FedAGHN. ① The server collects $\Delta\Theta^{(t)}=\{\Delta\theta_1^{(t)},\Delta\theta_2^{(t)},\dots,\Delta\theta_N^{(t)}\}$, where the update of personalized local model parameters $\Delta\theta_i^{(t)}$ is sent from client $i$. ② $\text{AGHN}_i$ takes $\Delta\Theta^{(t)}$ and $\Theta^{(t)}$ as input, where $\Theta^{(t)}=\bar{\Theta}^{(t)}+\Delta\Theta^{(t)}$ is calculated at the server. ③ $\text{AGHN}_i$ updates its trainable parameters, generates collaboration graphs $G_i$ with attentive weights, and outputs the personalized initial model $\bar{\theta}_i^{(t+1)}$, as detailed in the AGHN section. ④ $\bar{\theta}_i^{(t+1)}$ is sent to client $i$ to initialize the local model. ⑤ The client executes local training to adapt to its private local data and obtain the personalized local model $\theta_i^{(t+1)}$.
  • Figure 3: Illustration of the attentive graph hypernetwork for client $i$. $\text{AGHN}_i$ takes $\Theta^{(t)}$ and $\Delta\Theta^{(t)}$ as inputs and sequentially performs the following steps: (1) initializing collaboration graphs, (2) calculating attentive weights, and (3) aggregating parameters over graphs. Finally, $\text{AGHN}_i$ outputs the personalized initial model $\bar{\theta}^{(t+1)}_i$.
  • Figure 4: Visualization of data distribution with different degrees of statistical heterogeneity among 20 clients on CIFAR10, where the size of dots denotes the number of samples per class distributed to each client.
  • Figure 5:
  • ...and 3 more figures
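
The round structure described in the Figure 2 caption can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's code: models are flat lists of floats, `personalize` is a hypothetical callback standing in for $\text{AGHN}_i$, and `uniform_average` is a deliberately trivial placeholder for it. Only the bookkeeping is taken from the caption: the server reconstructs $\Theta^{(t)}=\bar{\Theta}^{(t)}+\Delta\Theta^{(t)}$, produces $\bar{\theta}_i^{(t+1)}$ per client, and each client trains locally and reports its update.

```python
def reconstruct_local_models(initial_models, updates):
    """Server side: Theta^(t) = Theta_bar^(t) + Delta Theta^(t), per client."""
    return [
        [base + delta for base, delta in zip(init, upd)]
        for init, upd in zip(initial_models, updates)
    ]


def uniform_average(i, local_models, updates):
    """Placeholder for AGHN_i (assumption): ignores i and the updates, averages all clients."""
    n = len(local_models)
    dim = len(local_models[0])
    return [sum(model[k] for model in local_models) / n for k in range(dim)]


def server_round(initial_models, updates, personalize):
    """Steps 1-4: collect updates, rebuild local models, emit one initial model per client."""
    local_models = reconstruct_local_models(initial_models, updates)
    return [personalize(i, local_models, updates) for i in range(len(local_models))]


def client_update(initial_model, local_train):
    """Step 5: train from the received initial model and report the resulting update."""
    trained = local_train(initial_model)
    delta = [t - b for t, b in zip(trained, initial_model)]
    return trained, delta
```

Swapping `uniform_average` for a function that weights clients differently per layer is where a learned collaboration graph would plug in; `local_train` is likewise a hypothetical callback for whatever local optimizer a client runs.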