HoGS: Homophily-Oriented Graph Synthesis for Local Differentially Private GNN Training

Wen Xu; Zhetao Li; Yong Xiao; Pengpeng Qiao; Mianxiong Dong; Kaoru Ota

TL;DR

HoGS addresses privacy risks in decentralized GNN training by generating a synthetic graph under $\varepsilon$-LDP that protects both links and node features. It privately collects perturbed adjacency lists and features, then uses homophily to perform bidirectional reconstruction: topology is inferred by Bayesian estimation with cosine similarity of noisy features, and features are denoised via a weighted aggregation over probable neighbors. The method guarantees $\varepsilon$-LDP through a split budget ($\varepsilon_a$, $\varepsilon_f$) with $\varepsilon_a=(1-\delta)\varepsilon$ and $\varepsilon_f=\delta\varepsilon$, and treats topology/feature reconstruction as post-processing. Empirical results on Cora, CiteSeer, and LastFM show HoGS substantially outperforms baselines across GCN, GraphSAGE, and GAT, demonstrating robust privacy-utility tradeoffs and practical impact for privacy-preserving GNN training.
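The budget split and the local perturbation of an adjacency list can be illustrated with a minimal sketch. It assumes binary randomized response as the link perturbation mechanism, which is a standard choice for edge LDP but is not confirmed by this summary as the mechanism HoGS uses. The function names `split_budget` and `randomized_response` and the example values are hypothetical.

```python
import math
import random


def split_budget(epsilon, delta):
    """Split a total budget into (eps_a, eps_f) = ((1 - delta) * eps, delta * eps)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return (1.0 - delta) * epsilon, delta * epsilon


def randomized_response(bits, eps, rng):
    """Assumed mechanism: keep each bit with probability e^eps / (1 + e^eps),
    flip it otherwise. This is illustrative, not necessarily HoGS's mechanism."""
    keep_prob = math.exp(eps) / (1.0 + math.exp(eps))
    return [b if rng.random() < keep_prob else 1 - b for b in bits]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
eps_a, eps_f = split_budget(epsilon=2.0, delta=0.25)

# One node's adjacency list as a bit vector (hypothetical toy data).
adjacency_row = [0, 1, 0, 0, 1, 0]
noisy_row = randomized_response(adjacency_row, eps_a, rng)
```

Each node would run the perturbation locally and send only `noisy_row` to the server; the feature budget `eps_f` would be spent by a separate feature perturbation mechanism that is not sketched here.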

Abstract

Graph neural networks (GNNs) have demonstrated remarkable performance in various graph-based machine learning tasks by effectively modeling high-order interactions between nodes. However, training GNNs without protection may leak sensitive personal information in graph data, including links and node features. Local differential privacy (LDP) is an advanced technique for protecting data privacy in decentralized networks. Unfortunately, existing local differentially private GNNs either only preserve link privacy or suffer significant utility loss in the process of preserving link and node feature privacy. In this paper, we propose an effective LDP framework, called HoGS, which trains GNNs with link and feature protection by generating a synthetic graph. Concretely, HoGS first collects the link and feature information of the graph under LDP, and then utilizes the phenomenon of homophily in graph data to reconstruct the graph structure and node features separately, thereby effectively mitigating the negative impact of LDP on the downstream GNN training. We theoretically analyze the privacy guarantee of HoGS and conduct experiments using the generated synthetic graph as input to various state-of-the-art GNN architectures. Experimental results on three real-world datasets show that HoGS significantly outperforms baseline methods in the accuracy of training GNNs.

Paper Structure (25 sections, 4 theorems, 12 equations, 9 figures, 3 tables, 3 algorithms)

Introduction
Preliminaries
Graph Neural Networks
Local Differential Privacy
Formal Problem Definition
Our Approach: HoGS
Overview
Information Collection
Topology Reconstruction
Feature Reconstruction
Algorithm Analysis
Privacy Analysis
Computational Complexity
Technical Novelty
Evaluation
...and 10 more sections

Key Result

Theorem 1

Combining multiple sub-mechanisms that satisfy LDP for $\{\varepsilon_1,\cdots, \varepsilon_k\}$ results in a mechanism satisfying $\varepsilon$-LDP, where $\varepsilon = \sum_{i=1}^{k} \varepsilon_i$.
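As a worked instance, applying Theorem 1 to the budget split stated in the TL;DR, with $k=2$ sub-mechanisms for link collection ($\varepsilon_a$) and feature collection ($\varepsilon_f$), gives back the total budget:

$$
\varepsilon_a + \varepsilon_f = (1-\delta)\varepsilon + \delta\varepsilon = \varepsilon .
$$

Because topology and feature reconstruction are treated as post-processing (Theorem 2), they would consume no additional budget under this accounting.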

Figures (9)

Figure 1: The architecture for sensitive data collection and training in a decentralized environment.
Figure 2: An example of reconstructing the graph under an LDP guarantee.
Figure 3: The overview of our HoGS framework.
Figure 4: Performance comparison of HoGS with other methods on GCN model given different privacy budgets.
Figure 5: Performance comparison of HoGS with other methods on GraphSAGE model given different privacy budgets.
...and 4 more figures

Theorems & Definitions (9)

Definition 1: $\varepsilon$-LDP
Theorem 1: sequential composition [mcsherry2009privacy]
Theorem 2: post-processing [dwork2006calibrating]
Definition 2: $\varepsilon$-edge LDP
Definition 3: $\varepsilon$-feature LDP
Lemma 3
proof
Theorem 4
proof
