SpreadFGL: Edge-Client Collaborative Federated Graph Learning with Adaptive Neighbor Generation

Luying Zhong, Yueyang Pi, Zheyi Chen, Zhengxin Yu, Wang Miao, Xing Chen, Geyong Min

TL;DR

SpreadFGL tackles missing inter-client topology and edge-server overload in Federated Graph Learning by introducing an adaptive graph imputation generator and a versatile assessor to uncover generalized cross-subgraph links without exposing raw data. It defines FedGL as a centralized baseline and extends it to a multi-edge SpreadFGL with distributed training, where edge servers share globally inferred topology and perform joint model updates. The framework leverages an autoencoder-driven latent representation, negative sampling, and graph fixing to repair cross-subgraph connections, improving feature propagation and downstream accuracy. Empirical results on real-world testbeds and benchmarks show SpreadFGL achieving higher accuracy and faster convergence than state-of-the-art FGL methods, with ablations confirming the synergy of its components.

Abstract

Federated Graph Learning (FGL) has garnered widespread attention by enabling collaborative training on multiple clients for semi-supervised classification tasks. However, most existing FGL studies do not adequately consider the inter-client topology information that is missing in real-world scenarios, which leads to insufficient feature aggregation from multi-hop neighbor clients during model training. Moreover, classic FGL commonly adopts FedAvg but neglects the high training costs incurred as the number of clients grows, which overloads a single edge server. To address these challenges, we propose a novel FGL framework, named SpreadFGL, to promote the information flow in edge-client collaboration and extract more generalized potential relationships between clients. In SpreadFGL, an adaptive graph imputation generator combined with a versatile assessor is first designed to exploit the potential links between subgraphs without sharing raw data. Next, a new negative sampling mechanism is developed to make SpreadFGL concentrate on more refined information in downstream tasks. To facilitate load balancing at the edge layer, SpreadFGL adopts distributed training, which enables fast model convergence. Extensive experiments on a real-world testbed and benchmark graph datasets demonstrate the effectiveness of the proposed SpreadFGL. The results show that SpreadFGL achieves higher accuracy and faster convergence than state-of-the-art algorithms.
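
For readers unfamiliar with the FedAvg step mentioned above, the following is a minimal sketch of the standard sample-weighted parameter average that a single server computes over all of its clients. It is not SpreadFGL's own aggregation rule. The function name `fedavg`, the `Params` type alias, and the dictionary-of-lists parameter layout are assumptions made for illustration only.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of floats.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], num_samples: List[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client and at least one client")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Params = {}
    # All clients are assumed to share the same parameter names and shapes.
    for name, reference in client_params[0].items():
        averaged[name] = [
            sum(n * params[name][i] for params, n in zip(client_params, num_samples)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two toy clients; the second holds three times as many samples as the first.
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
global_params = fedavg(clients, num_samples=[1, 3])
```

Because the server loops over every client for every parameter, the cost of this step grows linearly with the number of clients, which is the single-server load the abstract refers to.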

Paper Structure

This paper contains 14 sections, 16 equations, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: Comparison between the classic FGL and the FedGL designed in the proposed SpreadFGL. In Fig. 1 (left), the FGL [scardapane2020distributed] does not consider the inter-links between clients, causing insufficient feature propagation of multi-hop neighbors. In Fig. 1 (middle), the FGL [zhang2021subgraph] infers the missing links solely from local subgraphs but ignores the meaningful information in neighbor clients. In Fig. 1 (right), the proposed FedGL utilizes the globally shared information among clients, thereby extracting important cross-subgraph links for classification tasks.
  • Figure 2: Overview of the proposed SpreadFGL. SpreadFGL targets a distributed scenario that consists of multiple edge servers and clients. At the edge layer, the autoencoder is employed to explore potential global features of the covered clients, and the versatile assessor is then combined with a negative sampling mechanism to supervise refined information; model parameters may be transmitted between neighbor edge servers. At the client layer, GNNs are used as local node classifiers for downstream tasks, and graphic patchers are then employed to repair subgraphs and missing cross-subgraph links.
  • Figure 3: Real-world testbed for FedGL and SpreadFGL.
  • Figure 4: ACC of SpreadFGL with various numbers of clients and labeled ratios.
  • Figure 5: Accuracy of SpreadFGL with different values of $\mathcal{K}$.
  • ...and 4 more figures