GRAIN: Exact Graph Reconstruction from Gradients
Maria Drencheva, Ivo Petrov, Maximilian Baader, Dimitar I. Dimitrov, Martin Vechev
TL;DR
GRAIN reveals a tangible privacy risk in federated learning for graph-structured data by enabling exact gradient inversion on GNNs. It exploits the low-rank structure of gradient updates in linear layers and introduces a span-checking framework tailored to GCNs and GATs, enabling subgraph-level recovery that is iteratively refined and assembled via DFS. The authors define Graph Similarity Metrics (GSM) to quantify reconstruction quality across multi-hop neighborhoods and demonstrate that GRAIN achieves up to 80% exact graph reconstructions and high partial recovery across chemical, citation, and social datasets, often outperforming strong baselines. This work highlights significant privacy implications for FL with GNNs and motivates the development of defenses and more robust federated protocols to mitigate graph-structured data leakage.
Abstract
Federated learning claims to enable collaborative model training among multiple clients while preserving data privacy by transmitting gradient updates instead of the actual client data. However, recent studies have shown that client privacy is still at risk due to so-called gradient inversion attacks, which can precisely reconstruct clients' text and image data from the shared gradient updates. While these attacks demonstrate severe privacy risks for certain domains and architectures, the vulnerability of other commonly used data types, such as graph-structured data, remains under-explored. To bridge this gap, we present GRAIN, the first exact gradient inversion attack on graph data in the honest-but-curious setting that recovers both the structure of the graph and the associated node features. Concretely, we focus on Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT), two of the most widely used frameworks for learning on graphs. Our method first utilizes the low-rank structure of GNN gradients to efficiently reconstruct and filter the client subgraphs, which are then joined to complete the input graph. We evaluate our approach on molecular, citation, and social network datasets using our novel metric. We show that GRAIN reconstructs up to 80% of all graphs exactly, significantly outperforming the baseline, which achieves up to 20% correctly positioned nodes.
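The low-rank observation that the abstract relies on can be illustrated with a toy example. The sketch below is not GRAIN itself; it assumes a single plain linear layer Z = XW (no graph aggregation, no attention), for which the weight gradient is X^T (dL/dZ). Its column span is therefore contained in the span of the layer's input rows, and a candidate feature vector can be tested for membership with a rank comparison. All names (`matmul`, `rank`, `in_column_span`, the toy matrices) are hypothetical and chosen only for illustration. Exact rational arithmetic is used so the rank test is not affected by floating-point tolerance.

```python
from fractions import Fraction


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(rows):
    """Matrix rank via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_column_span(grad, candidate):
    """True if `candidate` (length d) lies in the column span of `grad` (d x m).

    Appending the candidate as an extra column leaves the rank unchanged
    exactly when it is a linear combination of the existing columns.
    """
    augmented = [row + [c] for row, c in zip(grad, candidate)]
    return rank(augmented) == rank(grad)


# Toy setting (assumed for illustration): 2 nodes, 4 input features, 3 outputs.
X = [[1, 0, 2, 0],
     [0, 1, 0, 3]]          # layer inputs, one row per node
G = [[1, 2, 0],
     [0, 1, 1]]             # stand-in for dL/dZ, one row per node
grad_W = matmul(transpose(X), G)   # 4 x 3 weight gradient, rank at most 2

print(rank(grad_W))
print(in_column_span(grad_W, [1, 0, 2, 0]))  # a true input row
print(in_column_span(grad_W, [1, 1, 0, 0]))  # an unrelated vector
```

In this toy case the gradient has rank 2 even though it is a 4 x 3 matrix, so most candidate vectors fail the check while the true input rows pass it. For a GCN or GAT, the input to each linear layer also depends on neighborhood aggregation, so the actual filtering in GRAIN operates on more than isolated node features.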
