Federated Graph Learning with Adaptive Importance-based Sampling

Anran Li; Yuanyuan Chen; Chao Ren; Wenhan Wang; Ming Hu; Tianlin Li; Han Yu; Qingyu Chen

TL;DR

The Federated Adaptive Importance-based Sampling (FedAIS) approach achieves substantial computational cost savings by focusing limited resources on training important nodes, and reduces communication overhead through adaptive historical embedding synchronization.

Abstract

Privacy-preserving graph learning tasks that involve distributed graph datasets require federated learning (FL)-based training of graph convolutional networks (GCNs), referred to as FedGCN. A key challenge for FedGCN is scaling to large graphs: the explosively increasing number of neighbors typically incurs high computation and communication costs. Existing graph sampling-enhanced FedGCN training approaches ignore either graph structural information or the dynamics of optimization, resulting in high variance and inaccurate node embeddings. To address this limitation, we propose the Federated Adaptive Importance-based Sampling (FedAIS) approach. It achieves substantial computational cost savings by focusing limited resources on training important nodes, while reducing communication overhead via adaptive historical embedding synchronization. The proposed adaptive importance-based sampling method jointly considers graph structural heterogeneity and the optimization dynamics to achieve an optimal trade-off between efficiency and accuracy. Extensive evaluations against five state-of-the-art baselines on five real-world graph datasets show that FedAIS achieves comparable or up to 3.23% higher test accuracy, while saving communication and computation costs by 91.77% and 85.59%, respectively.
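To make the idea of importance-based node sampling concrete, here is a minimal sketch. It is not the FedAIS sampling rule, which the abstract does not specify. As assumptions for illustration only, node degree stands in for "graph structural heterogeneity", per-node training loss stands in for "optimization dynamics", and a hypothetical mixing weight `alpha` blends the two. The function names `importance_scores` and `sample_nodes` are likewise invented here.

```python
import random


def importance_scores(degrees, losses, alpha=0.5):
    """Blend a structural signal (degree) with an optimization signal (loss).

    Assumption: both signals are non-negative. Each is normalized to sum
    to 1 before mixing, so the result is a probability distribution.
    """
    if len(degrees) != len(losses):
        raise ValueError("degrees and losses must have the same length")
    n = len(degrees)
    deg_total = sum(degrees)
    loss_total = sum(losses)
    probs = []
    for d, l in zip(degrees, losses):
        # Fall back to a uniform share if a signal carries no mass at all.
        p_deg = d / deg_total if deg_total > 0 else 1.0 / n
        p_loss = l / loss_total if loss_total > 0 else 1.0 / n
        probs.append(alpha * p_deg + (1.0 - alpha) * p_loss)
    return probs


def sample_nodes(probs, num_samples, seed=0):
    """Draw node indices with replacement and attach importance weights.

    The weight 1 / (num_samples * p_v) is the standard importance-sampling
    correction: summing weight * f(v) over the draws gives an unbiased
    estimate of the sum of f over all nodes.
    """
    rng = random.Random(seed)
    drawn = rng.choices(range(len(probs)), weights=probs, k=num_samples)
    return [(v, 1.0 / (num_samples * probs[v])) for v in drawn]


# Toy client with four local nodes (values are made up for illustration).
degrees = [1, 2, 5, 2]
losses = [0.9, 0.1, 0.4, 0.6]
probs = importance_scores(degrees, losses, alpha=0.5)
batch = sample_nodes(probs, num_samples=3, seed=42)
```

In this toy setup, a node with both high degree and moderate loss receives the largest sampling probability, while the reweighting keeps the sampled training objective unbiased. How FedAIS actually scores nodes and adapts the sampling over rounds is described in the paper body, not here.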

Paper Structure (18 sections, 3 theorems, 13 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Related Work
FedGCN Training
Sampling-based GCN Training
Problem Formulation
Our FedAIS Approach
Joint Analysis of Variance and Overhead
System Overview
Historical Embedding-based Graph Sampling
Adaptive Embedding Synchronization
Implementation
Convergence Analysis
Experiment Evaluation
Experimental Settings
Results and Discussions
...and 3 more sections

Key Result

Theorem 1

Under the Lipschitz assumption (ass:Lipschitz1), if for all $v \in V_k$ and all $l \in \{1, 2, \ldots, L-1\}$, the final output error of layer $L$ in training round $t \in [T]$ is bounded by:

Figures (7)

Figure 1: Test accuracy vs. communication costs.
Figure 2: System overview of FedAIS. $\textcircled{1}$-$\textcircled{2}$ cross-client neighbor embeddings, $\textcircled{3}$ local model, $\textcircled{4}$ global model $\theta_{t+1}$.
Figure 3: Accuracy versus total communication cost when training different FedGCN models.
Figure 4: The total computation and communication costs for training various FedGCN models.
Figure 5: Model performance vs. various ablation baselines.
...and 2 more figures

Theorems & Definitions (3)

Theorem 1
Theorem 2
Theorem 3
