Virtual Nodes Can Help: Tackling Distribution Shifts in Federated Graph Learning

Xingbo Fu, Zihan Chen, Yinhan He, Song Wang, Binchi Zhang, Chen Chen, Jundong Li

TL;DR

A novel FGL framework entitled FedVN is proposed that eliminates distribution shifts through client-specific graph augmentation strategies with multiple learnable Virtual Nodes (VNs) while training a global GNN model.

Abstract

Federated Graph Learning (FGL) enables multiple clients to jointly train powerful graph learning models, e.g., Graph Neural Networks (GNNs), without sharing their local graph data for graph-related downstream tasks, such as graph property prediction. In the real world, however, the graph data can suffer from significant distribution shifts across clients as the clients may collect their graph data for different purposes. In particular, graph properties are usually associated with invariant label-relevant substructures (i.e., subgraphs) across clients, while label-irrelevant substructures can appear in a client-specific manner. The issue of distribution shifts of graph data hinders the efficiency of GNN training and leads to serious performance degradation in FGL. To tackle the aforementioned issue, we propose a novel FGL framework entitled FedVN that eliminates distribution shifts through client-specific graph augmentation strategies with multiple learnable Virtual Nodes (VNs). Specifically, FedVN lets the clients jointly learn a set of shared VNs while training a global GNN model. To eliminate distribution shifts, each client trains a personalized edge generator that determines how the VNs connect local graphs in a client-specific manner. Furthermore, we provide theoretical analyses indicating that FedVN can eliminate distribution shifts of graph data across clients. Comprehensive experiments on four datasets under five settings demonstrate the superiority of our proposed FedVN over nine baselines.
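
The augmentation idea described in the abstract (shared virtual nodes plus a per-client edge generator) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the bilinear-sigmoid scoring rule, the dense adjacency layout, and the names `edge_weights`, `augment`, `W`, `V`, and `X` are all hypothetical. In this sketch `V` plays the role of the shared VNs and `W` stands in for the client-specific edge generator parameters.

```python
import math
import random


def edge_weights(X, V, W):
    """Hypothetical edge generator: soft weight sigmoid(x_i^T W v_j) per (node, VN) pair."""
    weights = []
    for x in X:
        # Project the node feature through the client-specific matrix W.
        xW = [sum(x[a] * W[a][b] for a in range(len(x))) for b in range(len(W[0]))]
        row = []
        for v in V:
            logit = sum(p * q for p, q in zip(xW, v))
            row.append(1.0 / (1.0 + math.exp(-logit)))
        weights.append(row)
    return weights


def augment(A, X, V, W):
    """Append k virtual nodes to a graph with adjacency A and node features X."""
    n, k = len(A), len(V)
    S = edge_weights(X, V, W)
    A_aug = [[0.0] * (n + k) for _ in range(n + k)]
    for i in range(n):
        for j in range(n):
            A_aug[i][j] = float(A[i][j])   # original edges are kept unchanged
        for j in range(k):
            A_aug[i][n + j] = S[i][j]      # node -> virtual node
            A_aug[n + j][i] = S[i][j]      # virtual node -> node (symmetric)
    X_aug = [list(x) for x in X] + [list(v) for v in V]
    return A_aug, X_aug


random.seed(0)
n, k, d = 4, 2, 3
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]  # a 4-node path graph
X = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
V = [[random.gauss(0, 1) for _ in range(d)] for _ in range(k)]  # shared across clients (assumed)
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]  # client-specific (assumed)

A_aug, X_aug = augment(A, X, V, W)
print(len(A_aug), len(A_aug[0]), len(X_aug))
```

In a federated setting one would expect `V` and the GNN weights to be aggregated by the server while `W` stays on each client; the sketch only shows the local augmentation step and leaves training out.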

Paper Structure

This paper contains 48 sections, 26 equations, 6 figures, 3 tables, and 1 algorithm.

Figures (6)

  • Figure 1: An overview of the proposed FedVN. FedVN aims to learn client-specific graph augmentation strategies by adding multiple virtual nodes through personalized edge generators so that the global GNN model can be trained over identical graphs.
  • Figure 2: Convergence curves of FedVN and other baselines on CMNIST/Color and SST2/Length.
  • Figure 3: (a) Performance of FedVN on SST2/Length with different values of $\lambda_1$ and $\lambda_2$. (b) Performance of FedVN with different numbers of VNs.
  • Figure 4: Pairwise cosine similarities between 10 VNs on Zinc/Scaffold.
  • Figure 5: Pairwise cosine similarities between 10 VNs on SST2/Length.
  • ...and 1 more figure

Theorems & Definitions (3)

  • proof
  • proof
  • proof