GRAIN: Exact Graph Reconstruction from Gradients

Maria Drencheva, Ivo Petrov, Maximilian Baader, Dimitar I. Dimitrov, Martin Vechev

TL;DR

GRAIN reveals a tangible privacy risk in federated learning for graph-structured data by enabling exact gradient inversion on GNNs. It exploits the low-rank structure of gradient updates in linear layers and introduces a span-checking framework tailored to GCNs and GATs, enabling subgraph-level recovery that is iteratively refined and assembled via DFS. The authors define Graph Similarity Metrics (GSM) to quantify reconstruction quality across multi-hop neighborhoods and demonstrate that GRAIN achieves up to 80% exact graph reconstructions and high partial recovery across chemical, citation, and social datasets, often outperforming strong baselines. This work highlights significant privacy implications for FL with GNNs and motivates the development of defenses and more robust federated protocols to mitigate graph-structured data leakage.

Abstract

Federated learning claims to enable collaborative model training among multiple clients with data privacy by transmitting gradient updates instead of the actual client data. However, recent studies have shown that client privacy is still at risk due to so-called gradient inversion attacks, which can precisely reconstruct clients' text and image data from the shared gradient updates. While these attacks demonstrate severe privacy risks for certain domains and architectures, the vulnerability of other commonly used data types, such as graph-structured data, remains under-explored. To bridge this gap, we present GRAIN, the first exact gradient inversion attack on graph data in the honest-but-curious setting that recovers both the structure of the graph and the associated node features. Concretely, we focus on Graph Convolutional Networks (GCN) and Graph Attention Networks (GAT) -- two of the most widely used frameworks for learning on graphs. Our method first utilizes the low-rank structure of GNN gradients to efficiently reconstruct and filter the client subgraphs, which are then joined to complete the input graph. We evaluate our approach on molecular, citation, and social network datasets using our novel metric. We show that GRAIN reconstructs up to 80% of all graphs exactly, significantly outperforming the baseline, which achieves up to 20% correctly positioned nodes.

Paper Structure

This paper contains 60 sections, 9 theorems, 20 equations, 13 figures, 12 tables, 8 algorithms.

Key Result

Theorem 3.1

If $n < d$ and if the matrix $\tfrac{\partial\mathcal{L}}{\partial {\bm{Y}}^l}$ is of full rank, then $\mathop{\mathrm{rowspan}}\nolimits({\bm{X}}^l) = \mathop{\mathrm{colspan}}\nolimits(\tfrac{\partial\mathcal{L}}{\partial {\bm{W}}^l})$.
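
The span relation in Theorem 3.1 can be checked numerically. The sketch below is illustrative only and is not the authors' implementation: it assumes a bias-free linear layer ${\bm{Y}}^l = {\bm{X}}^l {\bm{W}}^l$ with one row of ${\bm{X}}^l$ per node, so that $\tfrac{\partial\mathcal{L}}{\partial {\bm{W}}^l} = ({\bm{X}}^l)^\top \tfrac{\partial\mathcal{L}}{\partial {\bm{Y}}^l}$, and it uses a random matrix as a stand-in for $\tfrac{\partial\mathcal{L}}{\partial {\bm{Y}}^l}$. The helper `in_colspan`, the sizes, and the decoy candidates are hypothetical names and values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, d_out = 4, 16, 8  # assumed sizes: n nodes, d input features (n < d)

X = rng.normal(size=(n, d))          # layer input, one row per node
dL_dY = rng.normal(size=(n, d_out))  # random stand-in for dL/dY (full rank almost surely)

# Chain rule for the assumed layer Y = X W: dL/dW = X^T dL/dY, shape (d, d_out).
dL_dW = X.T @ dL_dY


def in_colspan(G, v, tol=1e-8):
    """Return True if v lies numerically in the column span of G."""
    # Orthonormal basis of colspan(G) from the left singular vectors.
    U, s, _ = np.linalg.svd(G, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))
    basis = U[:, :rank]
    # Distance from v to its orthogonal projection onto the span.
    residual = v - basis @ (basis.T @ v)
    return bool(np.linalg.norm(residual) <= tol * max(1.0, np.linalg.norm(v)))


# The gradient has rank n, far below min(d, d_out) when n is small.
grad_rank = np.linalg.matrix_rank(dL_dW)

# Span check as a filter: true input rows pass, random decoys are rejected.
decoys = rng.normal(size=(3, d))
candidates = np.vstack([X, decoys])
kept = [i for i, c in enumerate(candidates) if in_colspan(dL_dW, c)]
print("gradient rank:", grad_rank, "kept candidate indices:", kept)
```

Under these assumptions, a candidate vector passes the check only if it lies in the row span of the layer input, which is the kind of test a span-checking filter can apply to proposed node features. For GCN and GAT layers the input to the linear map also depends on the graph structure, which this sketch does not model.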

Figures (13)

  • Figure 1: Overview of GRAIN. GRAIN first recovers the input nodes $\mathcal{T}_0^*$ by filtering through the cross-product $\mathcal{T}_0$ of all possible feature values, e.g., all atom types $\mathcal{F}_1$ and all numbers of bonds $\mathcal{F}_2$. It then iteratively combines and filters them into a set of larger building blocks $\mathcal{T}_B^*$ up to a degree $L$. Finally, it reconstructs the input graph by combining building blocks from $\mathcal{T}_B^*$ in a DFS manner.
  • Figure 2: Glueing visualization
  • Figure 3: Ablation studies on how the data and architecture affect reconstructability
  • Figure 4: Example molecule reconstructions. Multivalent interactions are not recovered, as they are not considered by the GNN.
  • Figure 5: Examples of molecule reconstructions compared between GRAIN, DLG, and TabLeak.
  • ...and 8 more figures

Theorems & Definitions (14)

  • Theorem 3.1
  • Theorem 5.1
  • Corollary 5.1
  • Theorem 5.2
  • Theorem A.1
  • proof
  • Corollary A.0
  • proof
  • Theorem A.1
  • proof
  • ...and 4 more