
Acyclic Graph Pattern Counting under Local Differential Privacy

Yihua Hu, Kuncan Wang, Wei Dong

Abstract

Graph pattern counting serves as a cornerstone of network analysis with extensive real-world applications. Its integration with local differential privacy (LDP) has gained growing attention for protecting sensitive graph information in decentralized settings. However, existing LDP frameworks are largely ad hoc, offering solutions only for specific patterns such as triangles and stars. A general mechanism for counting arbitrary graph patterns, even for the subclass of acyclic patterns, has remained an open problem. To fill this gap, we present the first general solution for counting arbitrary acyclic patterns under LDP. We identify and tackle two fundamental challenges: generalizing pattern construction from distributed data and eliminating node duplication during the construction. To address the first challenge, we propose an LDP-tailored recursive subpattern counting framework that incrementally builds patterns across multiple communication rounds. For the second challenge, we apply a random marking technique that restricts each node to a unique position in the pattern during computation. Our mechanism achieves strong utility guarantees: for any acyclic graph pattern with $k$ edges, we achieve an additive error of $\tilde{O}(\sqrt{N}d(G)^k)$, where $N$ is the number of nodes and $d(G)$ is the maximum degree of the input graph $G$. Experiments on real-world graph datasets across multiple types of acyclic patterns demonstrate that our mechanisms achieve up to $46$-$2600\times$ improvement in utility and $300$-$650\times$ reduction in communication cost compared to the baseline methods.
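
The node-duplication issue and the random-marking idea mentioned in the abstract can be illustrated on a toy, non-private example. The sketch below is an assumption-laden illustration, not the authors' mechanism: it adds no privacy noise, the graph `EDGES` is hypothetical, and the mark set `{0, 1, 2}` with the rescaling factor $3^3 = 27$ is one plausible reading of "restricting each node to a unique position" for a 2-edge pattern. It shows that counting walks over-counts paths (a walk may revisit a node), while counting only walks whose node at position $p$ carries mark $p$ and rescaling recovers the path count in expectation.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy graph (not from the paper): triangle 0-1-2 plus pendant node 3.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3)]

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def count_two_edge_walks(adj):
    """Ordered walks a-b-c; the endpoints a and c may be the same node."""
    return sum(len(nbrs) ** 2 for nbrs in adj.values())

def count_two_edge_paths(adj):
    """Ordered paths a-b-c with a != c (no node duplication)."""
    return sum(len(nbrs) * (len(nbrs) - 1) for nbrs in adj.values())

def marked_count(adj, marks):
    """Count walks a-b-c with marks[a] == 0, marks[b] == 1, marks[c] == 2.

    A node carries exactly one mark, so it cannot fill both position 0
    and position 2; every counted walk is therefore a genuine path.
    """
    total = 0
    for b, nbrs in adj.items():
        if marks[b] != 1:
            continue
        starts = sum(1 for a in nbrs if marks[a] == 0)
        ends = sum(1 for c in nbrs if marks[c] == 2)
        total += starts * ends
    return total

def expected_estimate(adj):
    """Exact expectation of 27 * marked_count over uniform random marks.

    Enumerates all 3^n mark assignments instead of sampling, so the
    result is deterministic (feasible only for tiny graphs).
    """
    nodes = sorted(adj)
    total = 0
    for assignment in product(range(3), repeat=len(nodes)):
        marks = dict(zip(nodes, assignment))
        total += 27 * marked_count(adj, marks)
    return Fraction(total, 3 ** len(nodes))

ADJ = build_adjacency(EDGES)
```

Here each ordered path survives a uniformly random marking with probability $1/27$, which is why the rescaled count is unbiased for the path count; a walk that reuses a node never survives. In a private setting the per-node counts would additionally be perturbed before release, which this sketch deliberately omits.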

Paper Structure

This paper contains 49 sections, 32 theorems, 82 equations, 7 figures, 6 tables, and 4 algorithms.

Key Result

Lemma 1 (Post-Processing)

Let $\mathcal{M}: \mathcal{G} \to \mathcal{K}$ be a randomized mechanism that satisfies $\varepsilon$-DP. Then for any (possibly randomized) function $\mathcal{F}: \mathcal{K} \to \mathcal{Z}$, the mechanism $\mathcal{F} \circ \mathcal{M}: \mathcal{G} \to \mathcal{Z}$ also satisfies $\varepsilon$-DP.
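
As a small illustration of the post-processing property, the sketch below releases a noisy degree with the Laplace mechanism and then clamps and rounds the released value. The function names, and the assumption that a node's degree has sensitivity 1 (one edge added or removed), are illustrative choices rather than details taken from the paper. Because `post_process` reads only the already-released noisy value, the lemma implies that the final output keeps the same $\varepsilon$ guarantee.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale  # expovariate takes a rate, i.e. 1 / mean
    return rng.expovariate(rate) - rng.expovariate(rate)

def noisy_degree(degree, epsilon, rng):
    """Laplace mechanism M: assumes sensitivity 1, so the noise scale is 1 / epsilon."""
    return degree + laplace_noise(1.0 / epsilon, rng)

def post_process(noisy_value):
    """Function F: depends only on M's output, never on the raw degree."""
    return max(0, round(noisy_value))

def private_degree(degree, epsilon, rng):
    """The composition F(M(degree)); by post-processing it is still epsilon-DP."""
    return post_process(noisy_degree(degree, epsilon, rng))
```

A call such as `private_degree(5, 1.0, random.Random(0))` returns a non-negative integer. Clamping and rounding can change the accuracy of the estimate, but they cannot weaken the privacy guarantee.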

Figures (7)

  • Figure 1: Illustration of the $2$-line walk counting mechanism $\mathcal{M}_{\mathrm{2\text{-}wk}}$ applied to graph $G$ with privacy budget $\varepsilon = 4$. For simplicity, $\sum$ denotes $\sum_{j \in \mathcal{N}(i)}$.
  • Figure 2: Illustration of the $3$-line path counting mechanism $\mathcal{M}_{\mathrm{3\text{-}pt}}$ applied to graph $G$ with privacy budget $\varepsilon =1$. For simplicity, $\sum$ denotes $\sum_{j \in \mathcal{N}(i)}$. The initialization step occurs before random marking, but is shown afterward for clarity.
  • Figure 3: An example of pre-processing a $4$-edge pattern.
  • Figure 4: Selected input patterns and their tree formulations with vertex orderings.
  • Figure 5: Relative error (%) for $Q_{5\text{-wk}}$, $Q_{5\text{-pt}}$, $Q_{\pi_1}$, and $Q_{4\star}$ on the AstroPh dataset under varying $\varepsilon$ values.
  • ...and 2 more figures

Theorems & Definitions (48)

  • Definition 1: Differential Privacy
  • Lemma 1: Post-Processing
  • Lemma 2: Basic Composition
  • Lemma 3: Parallel Composition
  • Definition 2: Laplace Mechanism [dwork2006calibrating]
  • Lemma 4: Concentration Bound of Laplace Distributions [chan11continual]
  • Lemma 5: Multiplicative Chernoff Bound [doerr2019probabilistic]
  • Lemma 6: Chebyshev’s Inequality [chebyshev1867valeurs]
  • Theorem 1
  • Theorem 2
  • ...and 38 more