FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning
Yinlin Zhu, Xunkai Li, Zhengyu Wu, Di Wu, Miao Hu, Rong-Hua Li
TL;DR
This work targets subgraph-FL, where heterogeneity from node and topology variations degrades a global GNN. It decouples these variations and links them to differences in label distributions and class-wise homophily, which affect the reliability of class-wise predictions from local models. To address this, FedTAD introduces topology-aware data-free knowledge distillation: clients compute topology-aware embeddings to quantify class-wise reliability, while the server builds a pseudo-graph with a generator to distill reliable knowledge into the global model via an adversarial-like objective. Across six public datasets, FedTAD consistently improves over strong baselines (up to ~5% gains) and serves as a robust, plug-in enhancement for subgraph-FL optimization strategies, demonstrating practical benefits for distributed graph learning under heterogeneity.
Abstract
Subgraph federated learning (subgraph-FL) is a new distributed paradigm that enables collaborative training of graph neural networks (GNNs) over subgraphs held by multiple clients. A significant challenge in subgraph-FL is subgraph heterogeneity, which stems from node and topology variation and impairs the performance of the global GNN. Existing studies have not yet thoroughly investigated the mechanism behind this impact. To this end, we decouple node and topology variation, revealing that they correspond to differences in label distribution and structure homophily. Remarkably, these variations lead to significant differences in the class-wise knowledge reliability of the local GNNs, misguiding model aggregation to varying degrees. Building on this insight, we propose FedTAD, a topology-aware data-free knowledge distillation technique that enhances reliable knowledge transfer from the local models to the global model. Extensive experiments on six public datasets consistently demonstrate the superiority of FedTAD over state-of-the-art baselines.
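To make the two quantities named above concrete, the following minimal sketch computes a label distribution and a per-class homophily score for a single client subgraph. It is an illustration under stated assumptions, not FedTAD's procedure: the function names `label_distribution` and `class_wise_homophily` are hypothetical, and the homophily definition used here (for each class, the fraction of edge endpoints leaving nodes of that class that land on a same-class neighbor) is one common choice that may differ from the paper's exact formulation.

```python
import numpy as np


def label_distribution(labels, num_classes):
    """Fraction of nodes per class in one client's subgraph."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()


def class_wise_homophily(edges, labels, num_classes):
    """Per-class edge homophily (assumed definition, see lead-in).

    edges: (E, 2) integer array of undirected edges.
    Each undirected edge is counted once from each endpoint.
    Classes with no incident edges get NaN.
    """
    src = np.concatenate([edges[:, 0], edges[:, 1]])
    dst = np.concatenate([edges[:, 1], edges[:, 0]])
    same = (labels[src] == labels[dst]).astype(float)
    total = np.bincount(labels[src], minlength=num_classes)
    matched = np.bincount(labels[src], weights=same, minlength=num_classes)
    out = np.full(num_classes, np.nan)
    return np.divide(matched, total, out=out, where=total > 0)


# Toy subgraph: a path 0-1-2-3-4 with three class-0 nodes and two class-1 nodes.
toy_labels = np.array([0, 0, 0, 1, 1])
toy_edges = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])

toy_dist = label_distribution(toy_labels, num_classes=2)
toy_homophily = class_wise_homophily(toy_edges, toy_labels, num_classes=2)
print(toy_dist)
print(toy_homophily)
```

Comparing these two vectors across clients gives a rough picture of how much the clients differ in label distribution and in homophily, which are the two sources of heterogeneity the abstract identifies.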
