Tackling the Local Bias in Federated Graph Learning

Binchi Zhang, Minnan Luo, Shangbin Feng, Ziqi Liu, Jun Zhou, Qinghua Zheng

TL;DR

The paper tackles local bias in subgraph-level federated graph learning by identifying underutilization of cross-client edges and distribution divergence across clients. It introduces a distributed FGL framework that fully leverages cross-client edges to mimic centralized training, paired with a label-guided subgraph sampling strategy to balance data and reduce overhead. The authors provide an unbiased estimator for hidden representations and a convergence guarantee under a ρ-smooth loss, plus extensive experiments showing improved performance and faster convergence with lower time and memory costs. This approach advances scalable, privacy-preserving federated graph learning by aligning local models with centralized counterparts through principled cross-client information sharing.

Abstract

Federated graph learning (FGL) has become an important research topic in response to the increasing scale and distributed nature of real-world graph-structured data. In FGL, a global graph is distributed across different clients, and each client holds a subgraph. Existing FGL methods often fail to use cross-client edges effectively and therefore lose structural information during training; in addition, local graphs often exhibit significant distribution divergence. Together, these two issues make local models in FGL less desirable than models trained with centralized graph learning, which this paper terms the local bias problem. To solve this problem, we propose a novel FGL framework that makes the local models similar to the model trained in a centralized setting. Specifically, we design a distributed learning scheme that fully leverages cross-client edges to aggregate information from other clients. In addition, we propose a label-guided sampling approach that alleviates the imbalance in local data while markedly reducing the training overhead. Extensive experiments demonstrate that local bias can compromise model performance and slow down convergence during training. Experimental results also verify that our framework successfully mitigates local bias, achieving better performance than other baselines with lower time and memory overhead.

Paper Structure

This paper contains 27 sections, 2 theorems, 12 equations, 6 figures, 5 tables, 1 algorithm.

Key Result

Proposition

Matrix ${\boldsymbol{H}}_{S_i}^l$, defined by the paper's subgraph estimator equation, is an unbiased estimator of the hidden representation matrix ${\boldsymbol{H}}_i^l$, i.e., $\mathbb{E}[{\boldsymbol{H}}_{S_i}^l]={\boldsymbol{H}}_i^l$.
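
The estimator's definition is not reproduced on this page, so the following is only a minimal sketch of the general idea behind such a result, not the paper's construction. It assumes a single linear aggregation step with scalar node features, independent node sampling with known inclusion probabilities, and inverse-probability rescaling of each sampled term. All names (`adj`, `feats`, `probs`, and the helper functions) are hypothetical. The expectation is computed exactly by enumerating every sampling pattern, so no randomness is involved.

```python
from itertools import product


def aggregate(adj, feats):
    """Full aggregation over all nodes: h_i = sum_j adj[i][j] * feats[j]."""
    n = len(adj)
    return [sum(adj[i][j] * feats[j] for j in range(n)) for i in range(n)]


def sampled_aggregate(adj, feats, mask, probs):
    """Aggregation over sampled nodes only; each term is rescaled by 1 / p_j."""
    n = len(adj)
    return [
        sum(adj[i][j] * feats[j] / probs[j] for j in range(n) if mask[j])
        for i in range(n)
    ]


def expected_sampled_aggregate(adj, feats, probs):
    """Exact expectation of the sampled estimator over all inclusion patterns."""
    n = len(adj)
    expectation = [0.0] * n
    for mask in product([0, 1], repeat=n):
        # Probability of this pattern under independent node sampling.
        weight = 1.0
        for j in range(n):
            weight *= probs[j] if mask[j] else 1.0 - probs[j]
        estimate = sampled_aggregate(adj, feats, mask, probs)
        for i in range(n):
            expectation[i] += weight * estimate[i]
    return expectation


# Hypothetical toy graph: row-normalized adjacency and scalar features.
adj = [
    [0.5, 0.5, 0.0, 0.0],
    [0.25, 0.25, 0.25, 0.25],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
]
feats = [1.0, -2.0, 4.0, 0.5]
probs = [0.9, 0.5, 0.3, 0.7]  # assumed per-node inclusion probabilities

full = aggregate(adj, feats)
expected = expected_sampled_aggregate(adj, feats, probs)
print(full)
print(expected)
```

In this toy setting the two printed vectors agree up to floating-point error, because each term $a_{ij} x_j / p_j$ is included with probability $p_j$. A nonlinear activation applied after the aggregation would generally break this simple argument, which is why the sketch stops at the linear step.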

Figures (6)

  • Figure 1: The comparison of our proposed framework with existing FGL frameworks on a toy example where the global graph is stored on two clients. Previous frameworks cut off the cross-client edge between Client 1 and Client 2. Our framework leverages the cross-client edges in the learning process while preserving data privacy.
  • Figure 2: Illustration of our FGL framework where red edges denote the cross-client edges, ${\boldsymbol{Z}}$ denotes the hidden matrix, ${\boldsymbol{O}}$ and ${\boldsymbol{P}}$ denote the local terms in the inference stage and the backpropagation stage, respectively. First, each client independently samples a node subset based on the label distribution, which forms a global subgraph across all clients. Then, each client performs our proposed distributed inference method with our communication scheme for $L$ layers and obtains the output. Finally, distributed backpropagation is conducted similarly to finish the training process.
  • Figure 3: Impact of cross-client edges on local bias.
  • Figure 4: The impact of imbalanced local distribution on the local bias problem.
  • Figure 5: Local computation time and communication overhead with different sample sizes on Reddit.
  • ...and 1 more figures
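
The Figure 2 caption states that each client samples a node subset based on the label distribution. The exact sampling rule is not given on this page, so the sketch below is an assumption rather than the authors' method: it weights each node inversely to the frequency of its label, so that every class present on the client carries equal total sampling mass. The function names and the toy `labels` list are hypothetical, and sampling is done with replacement for simplicity.

```python
import random
from collections import Counter


def label_balanced_probs(labels):
    """Per-node probabilities giving every class equal total mass (assumed rule)."""
    counts = Counter(labels)
    num_classes = len(counts)
    return [1.0 / (num_classes * counts[y]) for y in labels]


def sample_nodes(labels, sample_size, seed=0):
    """Draw node indices with replacement according to the balanced weights."""
    rng = random.Random(seed)
    probs = label_balanced_probs(labels)
    return rng.choices(range(len(labels)), weights=probs, k=sample_size)


# Hypothetical imbalanced client: six nodes of class 0, two of class 1, one of class 2.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 2]
probs = label_balanced_probs(labels)

# Total sampling mass per class; equal by construction.
class_mass = {}
for y, p in zip(labels, probs):
    class_mass[y] = class_mass.get(y, 0.0) + p

sampled = sample_nodes(labels, sample_size=6)
print(class_mass)
print(sampled)
```

Under this assumed rule, minority-class nodes are drawn more often per node than majority-class nodes, which is one simple way to reduce label imbalance within a sampled subset while also shrinking the number of nodes processed per round.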

Theorems & Definitions (4)

  • Proposition
  • Theorem
  • Proof
  • Proof