Decoupled Subgraph Federated Learning

Javad Aliakbari, Johan Östman, Alexandre Graell i Amat

TL;DR

FedStruct tackles subgraph federated learning for node classification by decoupling node feature embeddings from global structural information. It exploits the explicit global graph structure through a decoupled GCN to produce node structure embeddings (NSEs) without sharing raw features, and introduces Hop2Vec to generate task-aware node structure features (NSFs). The framework achieves performance close to the centralized setting on six datasets, including heterophilic graphs, and remains robust to the number of clients and the partitioning scheme while keeping communication manageable. By preserving privacy in this way, it eases the trade-off between privacy, utility, and communication in distributed graph learning, which makes it practical for domains with sensitive relational data.

Abstract

We address the challenge of federated learning on graph-structured data distributed across multiple clients. Specifically, we focus on the prevalent scenario of interconnected subgraphs, where interconnections between different clients play a critical role. We present a novel framework for this scenario, named FedStruct, that harnesses deep structural dependencies. To uphold privacy, unlike existing methods, FedStruct eliminates the necessity of sharing or generating sensitive node features or embeddings among clients. Instead, it leverages explicit global graph structure information to capture inter-node dependencies. We validate the effectiveness of FedStruct through experimental results conducted on six datasets for semi-supervised node classification, showcasing performance close to the centralized approach across various scenarios, including different data partitioning methods, varying levels of label availability, and number of clients.

Paper Structure

This paper contains 39 sections, 5 theorems, 57 equations, 6 figures, 9 tables, 4 algorithms.

Key Result

Proposition 1

For client $i$, training FedStruct, i.e., computing $\{\hat{\boldsymbol{y}}_{v},\; \forall v \in \tilde{\mathcal{V}}_i\}$ and $\nabla_{\boldsymbol{\theta}} \mathcal{L}_i(\boldsymbol{\theta})$, requires only the local partition $\bar{\boldsymbol{A}}^{[i]}$ and $\boldsymbol{S}$.
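The locality claim in Proposition 1 can be checked numerically on a toy graph. The sketch below is only an illustration under stated assumptions, not the paper's implementation: it assumes the combined adjacency is a weighted sum of powers of a row-normalised adjacency matrix, that $\boldsymbol{S}$ stacks one node structure feature vector per node, and that the local partition is the block of rows belonging to the client's nodes. The helper `combined_adjacency`, the weights `betas`, and the node set `local_nodes` are hypothetical names introduced here.

```python
import numpy as np


def combined_adjacency(adj, betas):
    """Hypothetical L-hop combined adjacency: sum_l betas[l] * A_hat^(l+1).

    A_hat is the row-normalised adjacency matrix (an assumption of this sketch).
    """
    deg = adj.sum(axis=1, keepdims=True)
    a_hat = adj / np.maximum(deg, 1.0)      # avoid division by zero for isolated nodes
    power = np.eye(adj.shape[0])
    a_bar = np.zeros_like(a_hat)
    for beta in betas:
        power = power @ a_hat               # next power of A_hat
        a_bar += beta * power
    return a_bar


rng = np.random.default_rng(0)
n_nodes, nsf_dim = 8, 4

# Random undirected toy graph without self-loops.
upper = np.triu(rng.integers(0, 2, size=(n_nodes, n_nodes)), k=1)
adj = (upper + upper.T).astype(float)

a_bar = combined_adjacency(adj, betas=[0.6, 0.4])   # 2-hop example
S = rng.normal(size=(n_nodes, nsf_dim))             # one structure feature row per node

# Client i owns these nodes; its local partition is the matching row block.
local_nodes = np.array([0, 3, 5])
a_bar_local = a_bar[local_nodes]

# Structure embeddings of the client's nodes, computed two ways.
nse_local = a_bar_local @ S                 # uses only the row block and S
nse_global = (a_bar @ S)[local_nodes]       # uses the full matrix, then selects rows

assert np.allclose(nse_local, nse_global)
```

Both computations agree because each row of a matrix product depends only on the corresponding row of the left factor. Under the assumptions above, this is why a client would not need the rows of the combined adjacency that belong to other clients.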

Figures (6)

  • Figure 1: Node classification accuracy. For all datasets, FedStruct exhibits performance close to the centralized setting (Central GNN).
  • Figure 2: General design of the FedStruct framework. Global Graph: underlying graph consisting of interconnected subgraphs. Local graphs: clients' subgraphs augmented with external nodes (without features or labels). Structure encoding: Generates node structure features for each node and shares them with other clients. Augmented local graphs: Generate node feature embeddings and node structure embeddings. Federated learning: Federated learning step exploiting node feature embeddings and node structure embeddings.
  • Figure 3: Comparison between the $L$-hop combined adjacency matrix of the original dataset and the pruned version with $p = 30$. Although some details are lost in the pruning, the main community structure of the graph is preserved.
  • Figure 4: (a) Accuracy vs training ratio on Cora with random partitioning and 10 clients; (b) Accuracy vs number of propagation layers on Cora with K-means partitioning and 5 clients; (c) Accuracy vs number of clients on Chameleon with random partitioning; (d) Accuracy on Chameleon with 10 clients for various partitioning methods.
  • Figure 5: FedStruct framework when the server has knowledge of the global graph’s connections. Global Graph: underlying graph consisting of interconnected subgraphs. Local graphs: clients' subgraphs augmented with external nodes (without features or labels). Structure encoding: The server generates node structure features and node structure embeddings for each node and shares them with the clients. Augmented local graphs: Generate node feature embeddings. Federated learning: Federated learning step exploiting node feature embeddings and node structure embeddings.
  • ...and 1 more figure
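The caption of Figure 3 mentions a pruned version of the $L$-hop combined adjacency matrix with $p = 30$, but this listing does not say how the pruning is done. As a rough sketch only, the code below assumes that pruning keeps the $p$ largest entries in each row and zeroes the rest; that rule, the function name `prune_top_p`, and the toy matrix are assumptions made for illustration and may differ from the paper's procedure.

```python
import numpy as np


def prune_top_p(matrix, p):
    """Keep the p largest entries of every row and set the others to zero.

    This row-wise top-p rule is an assumed pruning scheme, not taken from the paper.
    """
    n_rows, n_cols = matrix.shape
    p = max(1, min(p, n_cols))
    # Column indices of the p largest values in each row (order within the p is arbitrary).
    keep = np.argpartition(matrix, n_cols - p, axis=1)[:, n_cols - p:]
    rows = np.arange(n_rows)[:, None]
    pruned = np.zeros_like(matrix)
    pruned[rows, keep] = matrix[rows, keep]
    return pruned


rng = np.random.default_rng(1)
dense = rng.random((6, 6))          # stand-in for a dense combined adjacency matrix
pruned = prune_top_p(dense, p=2)

# Each row now stores at most p non-zero weights.
nonzeros_per_row = (pruned > 0).sum(axis=1)
```

With such a rule, each row stores at most $p$ weights instead of one per node, which would reduce what has to be stored or exchanged per node, at the cost of the lost detail that the caption mentions.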

Theorems & Definitions (8)

  • Proposition 1
  • Proposition 2
  • proof
  • Lemma 1
  • proof
  • Proposition 3
  • proof
  • Proposition 4