FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning

Yinlin Zhu, Xunkai Li, Zhengyu Wu, Di Wu, Miao Hu, Rong-Hua Li

TL;DR

This work targets subgraph-FL, where heterogeneity from node and topology variations degrades a global GNN. It decouples these variations and links them to differences in label distributions and class-wise homophily, which affect the reliability of class-wise predictions from local models. To address this, FedTAD introduces topology-aware data-free knowledge distillation: clients compute topology-aware embeddings to quantify class-wise reliability, while the server builds a pseudo-graph with a generator to distill reliable knowledge into the global model via an adversarial-like objective. Across six public datasets, FedTAD consistently improves over strong baselines (up to ~5% gains) and serves as a robust, plug-in enhancement for subgraph-FL optimization strategies, demonstrating practical benefits for distributed graph learning under heterogeneity.

Abstract

Subgraph federated learning (subgraph-FL) is a new distributed paradigm in which multiple clients, each holding a subgraph, collaboratively train graph neural networks (GNNs). A significant challenge in subgraph-FL is subgraph heterogeneity, which stems from node and topology variation and impairs the performance of the global GNN. Existing studies have not yet thoroughly investigated the mechanism by which subgraph heterogeneity causes this degradation. To this end, we decouple node and topology variation, revealing that they correspond to differences in label distribution and structure homophily, respectively. Remarkably, these variations lead to significant differences in the class-wise knowledge reliability of the local GNNs, misleading model aggregation to varying degrees. Building on this insight, we propose topology-aware data-free knowledge distillation (FedTAD), which enhances reliable knowledge transfer from the local models to the global model. Extensive experiments on six public datasets consistently demonstrate the superiority of FedTAD over state-of-the-art baselines.
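
To make the two quantities named in the abstract concrete, the following minimal sketch computes a label distribution and a class-wise homophily score for one client's subgraph. The function names and the homophily definition used here (for each class, the fraction of neighbor links from nodes of that class that lead to a node of the same class) are illustrative assumptions, not necessarily the exact formulation used by FedTAD.

```python
from collections import Counter, defaultdict


def label_distribution(labels):
    """Return the fraction of nodes belonging to each class."""
    counts = Counter(labels)
    n = len(labels)
    return {c: counts[c] / n for c in sorted(counts)}


def class_wise_homophily(edges, labels):
    """Assumed definition: for each class c, the fraction of
    (node, neighbor) pairs with node label c whose neighbor is also c.
    Edges are treated as undirected, so each edge counts in both directions.
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            total[labels[a]] += 1
            if labels[a] == labels[b]:
                same[labels[a]] += 1
    return {c: same[c] / total[c] for c in sorted(total)}


# Hypothetical toy subgraph: a path of five nodes with two classes.
toy_labels = [0, 0, 0, 1, 1]
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(label_distribution(toy_labels))
print(class_wise_homophily(toy_edges, toy_labels))
```

Under this reading, two clients can share the same label distribution yet differ in class-wise homophily, which is the topology variation scenario described above.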

Paper Structure (25 sections, 11 equations, 5 figures, 4 tables, 1 algorithm)

Figures (5)

  • Figure 1: (a) Illustration of subgraph heterogeneity. $G_1$ and $G_2$ exhibit different node label distributions (i.e., node variation); $G_2$ and $G_3$ have the same node label distribution but differ significantly in topological properties (i.e., topology variation). (b) The two data simulation methods used in our empirical study. (c) Upper: label distribution under the node variation scenario, where a deeper color indicates a larger number of nodes; Lower: class-wise homophily distribution under the topology variation scenario, where a deeper color indicates stronger class-wise homophily. (d) Performance of global models with and without Client 1's participation under the node variation scenario (upper) and the topology variation scenario (lower).
  • Figure 2: An overview of our proposed FedTAD framework. On the client side, each client performs local initialization to measure class-wise knowledge reliability. On the server side, FedTAD can be regarded as a post-processing step after vanilla model aggregation, which enhances the transfer of reliable class-wise knowledge from the local models to the global model.
  • Figure 3: Convergence curves of our proposed FedTAD and baseline methods on four graph datasets with 10 participating clients.
  • Figure 4: Experimental results for the ablation study.
  • Figure 5: Sensitivity analysis for the two trade-off parameters $\lambda_1$ and $\lambda_2$.