Local Node Differential Privacy

Sofya Raskhodnikova, Adam Smith, Connor Wagaman, Anatoly Zavyalov

TL;DR

This work initiates the study of local node differential privacy (LNDP*) for graphs, in which each node is a party that holds its own incident edges and sends only a randomized report about them to an untrusted server. The authors introduce the blurry degree distribution, a low-sensitivity approximation of the graph's degree distribution, and answer linear queries on it privately and accurately via the factorization mechanism. They show that LNDP* algorithms can match central-model accuracy for several statistics (e.g., clique size, the Erdős–Rényi graph parameter, and certain edge counts), and they prove tight lower bounds that establish a separation from the central model for some problems. The paper also develops a lower-bound technique tailored to overlapping graph inputs (the splicing method), along with structural results that distinguish pure from approximate LNDP* and unrestricted from degrees-only LNDP*, with implications for interactive versus noninteractive settings. It closes with open questions on finer separations and the effect of interaction.

Abstract

We initiate an investigation of node differential privacy for graphs in the local model of private data analysis. In our model, dubbed LNDP, each node sees its own edge list and releases the output of a local randomizer on this input. These outputs are aggregated by an untrusted server to obtain a final output. We develop a novel algorithmic framework for this setting that allows us to accurately answer arbitrary linear queries on a blurry approximation of the input graph's degree distribution. For some natural problems, the resulting algorithms match the accuracy achievable with node privacy in the central model, where data are held and processed by a trusted server. We also prove lower bounds on the error required by LNDP that imply the optimality of our algorithms for several fundamental graph statistics. We then lift these lower bounds to the interactive LNDP setting, demonstrating the optimality of our algorithms even when constantly many rounds of interaction are permitted. Obtaining our lower bounds requires new approaches, since those developed for the usual local model do not apply to the inherently overlapping inputs that arise from graphs. Finally, we prove structural results that reveal qualitative differences between local node privacy and the standard local model for tabular data.

Paper Structure (59 sections, 52 theorems, 138 equations, 3 figures, 2 tables, 9 algorithms)

Key Result

Lemma 3.2

Let $\ell, u \in \mathbb{R}$ be such that $\ell < u$. Let $G$ be a graph on node set $[n]$ with $m$ edges, where every node $i\in [n]$ has degree $d_i \in [\ell,u]$. Then $m = \frac{1}{2}\cdot\bigl((u - \ell)\cdot \mathsf{ST}_{\ell,u}(G) + n \ell\bigr)$.
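
The identity in Lemma 3.2 can be checked numerically on a small graph. The sketch below is illustrative only and rests on an assumption not stated in this summary: that the soft threshold of a degree $d$ is $(d-\ell)/(u-\ell)$ clamped to $[0,1]$, and that $\mathsf{ST}_{\ell,u}(G)$ is the sum of this quantity over all nodes. The function names are hypothetical, and nothing here is private; it only verifies the arithmetic.

```python
def soft_threshold(d, lo, hi):
    # ASSUMPTION: clamp((d - lo) / (hi - lo), 0, 1); the paper's Definition 3.1
    # is not reproduced in this summary, so this form is a guess.
    return min(1.0, max(0.0, (d - lo) / (hi - lo)))


def soft_threshold_sum(degrees, lo, hi):
    # ASSUMPTION: ST_{lo,hi}(G) is the sum of soft thresholds over all nodes.
    return sum(soft_threshold(d, lo, hi) for d in degrees)


def edge_count_from_st(degrees, lo, hi):
    # Right-hand side of Lemma 3.2: (1/2) * ((hi - lo) * ST + n * lo).
    n = len(degrees)
    return 0.5 * ((hi - lo) * soft_threshold_sum(degrees, lo, hi) + n * lo)


# A 5-cycle with one chord: 6 edges, every degree is 2 or 3.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
degrees = [0] * 5
for a, b in edges:
    degrees[a] += 1
    degrees[b] += 1

# Any interval [lo, hi] containing all degrees should recover m = len(edges).
for lo, hi in [(2, 3), (1.5, 4)]:
    assert abs(edge_count_from_st(degrees, lo, hi) - len(edges)) < 1e-9
```

Under this assumed definition the lemma reduces to the handshake identity $\sum_i d_i = 2m$, since each term $(u-\ell)\cdot\mathsf{st}_{\ell,u}(d_i) + \ell$ equals $d_i$ when $d_i \in [\ell,u]$.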

Figures (3)

  • Figure 1: The soft threshold function $\mathsf{st}_{\ell,r}$ on the left (see \ref{def:deg-thresh-func}) and the probabilities used by the randomized rounding function $R_s$ for creating the blurry degree distribution ${\widetilde{\mathsf{dd}}_{G}^{s}}$ on the right (see \ref{def:blur-deg-dist-new}). For the blurry degree distribution, the colored points denote its domain, and the triangle above each point $x$ in the domain corresponds to the probability $\Pr[R_s(d) = x]$ for each degree $d\in[x-s,x+s]$.
  • Figure 2: A $3$-starpartite graph on $n=8$ nodes, where the three starred nodes connect to all other nodes.
  • Figure 3: Depiction of the distributions of the non-private averages $\overline{b_j}$ (left) and private averages $\overline{a_j}$ (right) for starpartite graphs (blue) and regular graphs (red). The probability of the event $\overline{a_j} \in [0, 1]$ differs by $\Omega\bigl(\frac{1}{\sigma_{\mathit{avg}}^3}\bigr)$ between regular and starpartite graphs, so $s = O(\sigma_{\mathit{avg}}^6)$ samples suffice to distinguish these distributions.
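
The triangular probabilities described in the Figure 1 caption can be illustrated with a short sketch. It is one possible reading of the caption, not the paper's definition. It assumes that the domain consists of multiples of $s$ and that $\Pr[R_s(d) = x] = 1 - |d - x|/s$ for the grid points $x$ within distance $s$ of $d$. The function names, and the choice to report the expected fraction of nodes at each grid point, are hypothetical. The sketch adds no noise and provides no privacy on its own.

```python
import math
import random
from collections import Counter


def rounding_probs(d, s):
    # ASSUMPTION: grid points are multiples of s, and a degree d is rounded to
    # one of its two neighbouring grid points with probability 1 - |d - x| / s.
    lower = math.floor(d / s) * s
    if lower == d:
        return {lower: 1.0}
    upper = lower + s
    return {lower: (upper - d) / s, upper: (d - lower) / s}


def randomized_round(d, s, rng=random):
    # Draw one sample of the assumed rounding function R_s(d).
    probs = rounding_probs(d, s)
    points = list(probs)
    return rng.choices(points, weights=[probs[x] for x in points])[0]


def blurry_degree_distribution(degrees, s):
    # Hypothetical helper: expected fraction of nodes landing on each grid point.
    dist = Counter()
    for d in degrees:
        for x, p in rounding_probs(d, s).items():
            dist[x] += p / len(degrees)
    return dict(dist)


# Degree 7 with blur width 4 lies between grid points 4 and 8.
print(rounding_probs(7, 4))
print(blurry_degree_distribution([3, 2, 3, 2, 2], 2))
```

Under this reading the rounding is unbiased (the expected value of $R_s(d)$ is $d$), and changing a degree by 1 shifts only $1/s$ of that node's probability mass between adjacent grid points. That is consistent with the TL;DR's description of the blurry degree distribution as having low sensitivity, though the precise sensitivity statement is in the paper, not here.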

Theorems & Definitions (105)

  • Definition 2.1: $(\varepsilon,\delta)$-indistinguishability
  • Definition 2.2: Differential privacy (DP) [DworkMNS16]
  • Definition 2.3: Noninteractive local node differential privacy ($\mathrm{LNDP}^\star$)
  • Definition 3.1: Soft threshold
  • Lemma 3.2: The soft threshold function counts edges
  • Proof: Proof of \ref{lem:st-counts-edges}
  • Lemma 3.3
  • Proof: Proof of \ref{thm:deg-thresh-priv-acc}
  • Theorem 3.4
  • Proof: Proof of \ref{thm:edge-ct-alg}
  • ...and 95 more