Enhancing Federated Graph Learning via Adaptive Fusion of Structural and Node Characteristics

Xianjun Gao, Jianchun Liu, Hongli Xu, Shilong Wang, Liusheng Huang

TL;DR

FedGCF tackles non-IID graph data in Federated Graph Learning by concurrently extracting structural properties and node features and adaptively fusing them via a learning-driven pipeline. It introduces Parallel Characteristics Extraction (PCE) to derive cluster-specific structural models and a common node model from a feature-based topology, followed by Graph Characteristics Fusion (GCF) that uses a Multi-Armed Bandit to select an optimal fusion ratio. Across three benchmark datasets, FedGCF achieves 4.94%–7.24% higher test accuracy than strong baselines and reduces communication costs by 64.18%–81.25% to reach the same performance, with robust behavior under IID and non-IID distributions and strong performance on highly heterogeneous data (MIX). The approach demonstrates that adaptive, characteristic-aware fusion of global structure and local node information can substantially improve both accuracy and efficiency in Federated Graph Learning.
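To make the ratio-selection idea concrete, the following is a minimal sketch of a Multi-Armed Bandit choosing among a discrete set of candidate fusion ratios. The summary does not state which bandit algorithm FedGCF uses, so the epsilon-greedy strategy, the `RatioBandit` class, the candidate ratios, and the reward signal (for example, validation accuracy after fusing with the chosen ratio) are all assumptions made for illustration.

```python
import random


class RatioBandit:
    """Epsilon-greedy bandit over candidate fusion ratios (illustrative only)."""

    def __init__(self, ratios, epsilon=0.1, seed=0):
        self.ratios = list(ratios)              # each arm is one candidate ratio
        self.epsilon = epsilon                  # probability of exploring
        self.counts = [0] * len(self.ratios)    # pulls per arm
        self.values = [0.0] * len(self.ratios)  # running mean reward per arm
        self.rng = random.Random(seed)

    def _best_arm(self):
        return max(range(len(self.ratios)), key=lambda a: self.values[a])

    def select(self):
        # Pull every arm once before trusting the running means.
        for arm, pulls in enumerate(self.counts):
            if pulls == 0:
                return arm
        # Explore with probability epsilon, otherwise exploit the best arm.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.ratios))
        return self._best_arm()

    def update(self, arm, reward):
        # Incremental update of the mean reward observed for this arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

    def best_ratio(self):
        return self.ratios[self._best_arm()]
```

In a training loop, one would call `select()` at the start of a round, fuse the two models with the corresponding ratio, and pass the resulting accuracy (or another reward of choice) to `update()`.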

Abstract

Federated Graph Learning (FGL) has demonstrated the advantage of training a global Graph Neural Network (GNN) model across distributed clients using their local graph data. Unlike Euclidean data (e.g., images), graph data is composed of nodes and edges, where the overall node-edge connections determine the topological structure, and individual nodes along with their neighbors capture local node features. However, existing studies tend to prioritize one aspect over the other, leading to an incomplete understanding of the data and the potential misidentification of key characteristics across varying graph scenarios. Additionally, the non-independent and identically distributed (non-IID) nature of graph data makes the extraction of these two data characteristics even more challenging. To address the above issues, we propose a novel FGL framework, named FedGCF, which aims to simultaneously extract and fuse structural properties and node features to effectively handle diverse graph scenarios. FedGCF first clusters clients by structural similarity, performing model aggregation within each cluster to form the shared structural model. Next, FedGCF selects the clients with common node features and aggregates their models to generate a common node model. This model is then propagated to all clients, allowing common node features to be shared. By combining these two models with a proper ratio, FedGCF can achieve a comprehensive understanding of the graph data and deliver better performance, even under non-IID distributions. Experimental results show that FedGCF improves accuracy by 4.94%–7.24% under different data distributions and reduces communication cost by 64.18%–81.25% to reach the same accuracy compared to baselines.
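The aggregation and fusion steps described in the abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: models are represented as plain dictionaries of parameter lists, aggregation is a sample-size-weighted average in the style of FedAvg, cluster assignments and the set of clients with common node features are taken as given inputs, and fusion is assumed to be a linear interpolation of parameters controlled by `ratio`. The abstract does not specify any of these details, and all function names are hypothetical.

```python
def average_models(models, weights):
    """Weighted average of models given as {name: [float, ...]} dicts."""
    total = float(sum(weights))
    return {
        name: [
            sum(w * m[name][i] for m, w in zip(models, weights)) / total
            for i in range(len(models[0][name]))
        ]
        for name in models[0]
    }


def aggregate_by_cluster(client_models, client_sizes, cluster_of):
    """Build one shared structural model per cluster of clients."""
    members = {}
    for client, cluster in cluster_of.items():
        members.setdefault(cluster, []).append(client)
    return {
        cluster: average_models(
            [client_models[c] for c in clients],
            [client_sizes[c] for c in clients],
        )
        for cluster, clients in members.items()
    }


def fuse(structural_model, node_model, ratio):
    """Assumed fusion rule: ratio * structural + (1 - ratio) * node."""
    return {
        name: [
            ratio * s + (1.0 - ratio) * n
            for s, n in zip(structural_model[name], node_model[name])
        ]
        for name in structural_model
    }


# Toy example: clients "a" and "b" share a structural cluster, "c" is alone.
client_models = {
    "a": {"w": [1.0, 2.0]},
    "b": {"w": [3.0, 4.0]},
    "c": {"w": [10.0, 10.0]},
}
client_sizes = {"a": 1, "b": 3, "c": 5}
cluster_of = {"a": 0, "b": 0, "c": 1}

structural = aggregate_by_cluster(client_models, client_sizes, cluster_of)
# Assume clients "a" and "c" were selected as having common node features.
node = average_models([client_models["a"], client_models["c"]], [1, 1])
fused_for_cluster_0 = fuse(structural[0], node, ratio=0.5)
```

Each client would receive the fused model for its own cluster; a `ratio` of 1.0 keeps only the structural model and 0.0 keeps only the common node model.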

Paper Structure

This paper contains 17 sections, 7 equations, 9 figures, 4 tables, 2 algorithms.

Figures (9)

  • Figure 1: The impact of different data characteristics on two datasets.
  • Figure 2: Illustration of the Graph Characteristics Extraction and Fusion step in FedGCF. (a) FedGCF executes the structural properties extraction process, where clients are clustered based on structural properties (i.e., client shape). The clustering results will be used to aggregate the shared structural model $\omega_S$ for different clusters. (b) FedGCF executes the node features extraction process, where the clients with the common node features are selected (clients with arrows passing through) based on node features (i.e., client color) to aggregate the common node model $\omega_N$. After both characteristics are extracted, they are fused based on the combination ratio to perform the prediction task.
  • Figure 3: The test accuracy and communication cost for FedGCF and baselines on the Small Molecules dataset.
  • Figure 4: The test accuracy and communication cost for FedGCF and baselines on the Social Networks dataset (the baseline below the accuracy threshold is marked with the cross).
  • Figure 5: The test accuracy and communication cost for FedGCF and baselines on the Mix dataset (the baselines below the accuracy threshold are marked with the cross).
  • ...and 4 more figures