AdaFGL: A New Paradigm for Federated Node Classification with Topology Heterogeneity

Xunkai Li, Zhengyu Wu, Wentao Zhang, Henan Sun, Rong-Hua Li, Guoren Wang

TL;DR

AdaFGL addresses topology heterogeneity in Federated Graph Learning by introducing the structure Non-iid split and a decoupled two-step paradigm. It first learns a client-shared federated knowledge extractor via standard federated training, then performs local personalized propagation with both homophily and heterophily modules, fused adaptively under the guidance of a Homophily Confidence Score (HCS). The approach yields state-of-the-art performance across 12 datasets, with notable gains under both community split and structure Non-iid split, while reducing communication cost and privacy risk. This work provides a practical benchmark and a flexible framework for real-world FGL deployment under diverse topologies and client conditions.
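The adaptive fusion step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the Homophily Confidence Score is a scalar in [0, 1] and that the two module outputs are blended by a convex combination; the summary above does not state the exact rule, and the names `fuse_predictions` and `predict_labels` are hypothetical.

```python
def fuse_predictions(p_ho, p_he, hcs):
    """Blend per-node class probabilities from the two propagation modules.

    p_ho, p_he: lists of per-node probability rows (homophilous / heterophilous).
    hcs: assumed scalar confidence in [0, 1]; a higher value trusts p_ho more.
    The convex combination is an assumption, not the paper's stated formula.
    """
    if not 0.0 <= hcs <= 1.0:
        raise ValueError("hcs must lie in [0, 1]")
    return [
        [hcs * a + (1.0 - hcs) * b for a, b in zip(row_ho, row_he)]
        for row_ho, row_he in zip(p_ho, p_he)
    ]


def predict_labels(probs):
    """Return the index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


if __name__ == "__main__":
    p_ho = [[0.9, 0.1], [0.2, 0.8]]
    p_he = [[0.4, 0.6], [0.6, 0.4]]
    fused = fuse_predictions(p_ho, p_he, hcs=0.8)
    print(fused, predict_labels(fused))
```

With a score near 1 the fused prediction follows the homophilous module, and with a score near 0 it follows the heterophilous one, which is the behaviour the TL;DR describes qualitatively.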

Abstract

Recently, Federated Graph Learning (FGL) has attracted significant attention as a distributed framework based on graph neural networks, primarily due to its capability to break data silos. Existing FGL studies employ community split on the homophilous global graph by default to simulate federated semi-supervised node classification settings. Such a strategy assumes the consistency of topology between the multi-client subgraphs and the global graph, where connected nodes are highly likely to possess similar feature distributions and the same label. However, in real-world implementations, the varying perspectives of local data engineering result in various subgraph topologies, posing unique heterogeneity challenges in FGL. Unlike the well-known label non-independent and identically distributed (Non-iid) problems in federated learning, FGL heterogeneity essentially reveals the topological divergence among multiple clients, namely homophily or heterophily. To simulate and handle this unique challenge, we introduce the concept of structure Non-iid split and then present a new paradigm called Adaptive Federated Graph Learning (AdaFGL), a decoupled two-step personalized approach. To begin with, AdaFGL employs standard multi-client federated collaborative training to acquire the federated knowledge extractor by aggregating uploaded models in the final round at the server. Then, each client conducts personalized training based on the local subgraph and the federated knowledge extractor. Extensive experiments on 12 graph benchmark datasets validate the superior performance of AdaFGL over state-of-the-art baselines. Specifically, in terms of test accuracy, our proposed AdaFGL outperforms baselines by significant margins of 3.24% and 5.57% on community split and structure Non-iid split, respectively.
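The server-side aggregation in the first step can be sketched as follows. This is a minimal illustration that assumes FedAvg-style weighting by local sample count, which the abstract does not specify beyond "standard" federated training; the function name `aggregate_models` and the dict-of-flat-lists parameter layout are hypothetical.

```python
def aggregate_models(client_params, client_sizes):
    """Average client parameters, weighting each client by its sample count.

    client_params: list of dicts mapping a parameter name to a flat list of floats.
    client_sizes: number of local training nodes per client (assumed weighting).
    Returns one dict with the same keys holding the weighted averages.
    """
    if len(client_params) != len(client_sizes) or not client_params:
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    aggregated = {}
    for name, reference in client_params[0].items():
        aggregated[name] = [
            sum(size / total * params[name][i]
                for params, size in zip(client_params, client_sizes))
            for i in range(len(reference))
        ]
    return aggregated


if __name__ == "__main__":
    uploads = [{"w": [0.0, 4.0]}, {"w": [4.0, 0.0]}]
    print(aggregate_models(uploads, client_sizes=[1, 3]))
```

In this sketch the result of the final round would play the role of the federated knowledge extractor that is sent back to every client before personalized training.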

Paper Structure

This paper contains 20 sections, 1 theorem, 17 equations, 11 figures, 8 tables, 2 algorithms.

Key Result

Proposition 1

Among multiple clients of FGL, topological homophily attracts both the global model and optima, while topological heterophily diverges the global model and optima.

Figures (11)

  • Figure 1: Overview of standard FGL pipeline with two data simulation strategies in the same homophilous global graph. The different colors of the nodes represent the different labels.
  • Figure 2: Empirical analysis on Cora with a 10-client community split and structure Non-iid split. (a) The color from white to blue represents the number of nodes held by different clients in each class. (b) Quantification of the topology of multi-client subgraphs from both node and edge perspectives, where higher values indicate stronger structural homophily. See the Preliminaries section for the detailed computation of the metrics. (c) The x-axis of the line plot represents the federated training round. (d) The x-axis of the bar plot represents the client ID.
  • Figure 3: A toy example illustrating the impact of topology heterogeneity on FGL with the homophilous global graph.
  • Figure 4: Overview of AdaFGL. Step 1: standard federated collaborative training, where the federated knowledge extractor parameterized by $\widetilde{\mathbf{W}}^{T+1}$ is obtained by aggregating the models uploaded in the last training round at the server and is then broadcast to each client. Each client then employs the federated knowledge extractor to optimize the local topology and obtain the optimized probability propagation matrix $\widetilde{\mathbf{P}}$. Step 2: each client executes the homophilous and heterophilous propagation modules $f_{Ho}$ and $f_{He}$. Subsequently, each client adaptively combines the results of these two modules using the HCS to obtain the final predictions. More training and design details can be found in the Adaptive Federated Graph Learning section.
  • Figure 5: Predictive performance under different topology heterogeneity.
  • ...and 6 more figures
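
The node- and edge-level homophily measurements mentioned for Figure 2(b) can be illustrated with a short sketch. It uses the commonly cited definitions (the fraction of same-label edges, and the per-node fraction of same-label neighbors averaged over nodes); the paper's own Preliminaries may define the metrics differently, so treat these formulas and the function names as assumptions.

```python
def edge_homophily(edges, labels):
    """Fraction of undirected edges whose two endpoints share a label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def node_homophily(edges, labels):
    """Mean over nodes of the share of neighbors carrying the node's own label.

    Isolated nodes have no neighbors and are skipped in this sketch.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    if not neighbors:
        raise ValueError("graph has no edges")
    ratios = [
        sum(1 for n in nbrs if labels[n] == labels[u]) / len(nbrs)
        for u, nbrs in neighbors.items()
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]              # label of each node in a client subgraph
    edges = [(0, 1), (1, 2), (2, 3)]   # undirected edges of a path graph
    print(edge_homophily(edges, labels), node_homophily(edges, labels))
```

Under these definitions, values near 1 indicate a homophilous client subgraph and values near 0 a heterophilous one, matching the figure's note that higher values mean stronger structural homophily.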

Theorems & Definitions (3)

  • Proposition 1
  • Definition 1
  • Definition 2