Graph Structure Learning with Privacy Guarantees for Open Graph Data

Muhao Guo, Jiaqi Wu, Yizheng Liao, Wenke Lee, Shengzhe Chen, Yang Weng

Abstract

Publishing open graph data while preserving individual privacy remains challenging when data publishers and data users are distinct entities. Although differential privacy (DP) provides rigorous guarantees, most existing approaches enforce privacy during model training rather than at the data publishing stage. This limits the applicability to open-data scenarios. We propose a privacy-preserving graph structure learning framework that integrates Gaussian Differential Privacy (GDP) directly into the data release process. Our mechanism injects structured Gaussian noise into raw data prior to publication and provides formal $\mu$-GDP guarantees, leading to tight $(\varepsilon, \delta)$-differential privacy bounds. Despite the distortion introduced by privatization, we prove that the original sparse inverse covariance structure can be recovered through an unbiased penalized likelihood formulation. We further extend the framework to discrete data using discrete Gaussian noise while preserving privacy guarantees. Extensive experiments on synthetic and real-world datasets demonstrate strong privacy-utility trade-offs, maintaining high graph recovery accuracy under rigorous privacy budgets. Our results establish a formal connection between differential privacy theory and privacy-preserving data publishing for graphical models.
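
The step from a $\mu$-GDP guarantee to $(\varepsilon, \delta)$-DP bounds can be made concrete with the standard GDP duality result: a $\mu$-GDP mechanism is $(\varepsilon, \delta(\varepsilon))$-DP for every $\varepsilon \ge 0$, where $\delta(\varepsilon) = \Phi(-\varepsilon/\mu + \mu/2) - e^{\varepsilon}\,\Phi(-\varepsilon/\mu - \mu/2)$ and $\Phi$ is the standard normal CDF. The following is a minimal sketch of that conversion, not code from the paper; the function names `gdp_to_delta` and `gdp_to_eps` and the search bound `eps_max` are illustrative assumptions.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def gdp_to_delta(mu: float, eps: float) -> float:
    """delta(eps) such that a mu-GDP mechanism is (eps, delta)-DP."""
    if mu <= 0 or eps < 0:
        raise ValueError("mu must be positive and eps non-negative")
    return (std_normal_cdf(-eps / mu + mu / 2.0)
            - math.exp(eps) * std_normal_cdf(-eps / mu - mu / 2.0))


def gdp_to_eps(mu: float, delta: float, eps_max: float = 100.0,
               tol: float = 1e-10) -> float:
    """Smallest eps in [0, eps_max] with delta(eps) <= delta, by bisection.

    Relies on delta(eps) being decreasing in eps. `eps_max` is an assumed
    search bound, not a quantity taken from the paper.
    """
    if gdp_to_delta(mu, 0.0) <= delta:
        return 0.0
    lo, hi = 0.0, eps_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gdp_to_delta(mu, mid) > delta:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0):
        print(f"mu={mu}: eps at delta=1e-5 is {gdp_to_eps(mu, 1e-5):.4f}")
```

Smaller values of $\mu$ correspond to stronger privacy, so for a fixed $\delta$ the returned $\varepsilon$ grows with $\mu$.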

Paper Structure

This paper contains 28 sections, 16 theorems, 44 equations, 6 figures, 2 tables, 2 algorithms.

Key Result

theorem 1

Let $\mathcal{M}$ be the Gaussian privatization mechanism defined in eq:encryptcov. Suppose the data matrix $X \in \mathbb{R}^{n \times p}$ satisfies the column-norm condition … Then $\mathcal{M}$ satisfies $\mu$-Gaussian differential privacy ($\mu$-GDP) with … where $\Delta_f = \sup_{X,X'} \| f(X) - f(X') \|$ is the global sensitivity of $f$ over all neighboring datasets $X$ and $X'$ that differ …
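
The exact noise calibration of the theorem is not reproduced in this excerpt. As a hedged illustration of how sensitivity and noise scale interact, the textbook Gaussian mechanism result states that adding i.i.d. $\mathcal{N}(0, \sigma^2)$ noise to a quantity with $\ell_2$ sensitivity $\Delta_f$ is $\mu$-GDP with $\mu = \Delta_f / \sigma$. The sketch below applies that generic rule entrywise to a data matrix; it uses plain i.i.d. noise rather than the paper's structured noise, and the names `noise_scale` and `privatize`, as well as the caller-supplied `sensitivity`, are assumptions made for illustration.

```python
import random


def noise_scale(sensitivity: float, mu: float) -> float:
    """Noise standard deviation sigma = sensitivity / mu for a target mu-GDP level.

    Textbook Gaussian-mechanism calibration; the paper's own expression
    may differ.
    """
    if sensitivity <= 0 or mu <= 0:
        raise ValueError("sensitivity and mu must be positive")
    return sensitivity / mu


def privatize(X, sensitivity: float, mu: float, seed=None):
    """Return a noisy copy of X (a list of rows) with i.i.d. Gaussian noise.

    `sensitivity` must be supplied by the caller, for example from a known
    bound on the data. This is an assumption of the sketch, not the paper's
    column-norm condition.
    """
    rng = random.Random(seed)
    sigma = noise_scale(sensitivity, mu)
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in X]


if __name__ == "__main__":
    X = [[0.1, -0.3, 0.5], [0.2, 0.0, -0.4]]
    print(privatize(X, sensitivity=1.0, mu=1.0, seed=0))
```

Halving $\mu$ doubles $\sigma$, which is the privacy-utility trade-off in its simplest form: stronger privacy means noisier published data.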

Figures (6)

  • Figure 1: The data publisher can publish privacy-preserving data without sacrificing the data users' analysis performance.
  • Figure 2: The cross-validation for selecting $\lambda$. The red circle represents the corresponding MSE of the selected $\lambda$.
  • Figure 3: Estimated adjacency matrices for an 11-bus power grid at 20 dB noise. Left to right: ground truth, G-Wishart, SCIO, Neighborhood Selection, our approach, and its privacy–utility curve.
  • Figure 4: ROC and AUC of our approach and vanilla Graphical Lasso on eight real-world datasets.
  • Figure 5: Cell signaling, Power system, Chickenpox, and Soil microbiome datasets.
  • ...and 1 more figure

Theorems & Definitions (25)

  • definition 1: $(\varepsilon, \delta)$-Differential Privacy
  • definition 2: Trade-off function $T$
  • definition 3: Gaussian differential privacy
  • theorem 1
  • theorem 2
  • theorem 3: Sparse graph estimation with continuous privacy-preserving data
  • corollary 1: Uniqueness of $\widehat{\boldsymbol{\Theta}}$
  • definition 4: Discrete Gaussian (Canonne et al., 2020)
  • theorem 4
  • theorem 5: Sparse graph estimation with discrete privacy-preserving data
  • ...and 15 more