Communication-Efficient Triangle Counting under Local Differential Privacy

Jacob Imola, Takao Murakami, Kamalika Chaudhuri

TL;DR

This work tackles triangle counting under local differential privacy with the challenge of balancing privacy, accuracy, and communication. It introduces two-round algorithms that combine edge randomized response with edge sampling, plus a 4-cycle trick to reduce correlated errors, enabling significant communication reductions compared to prior dense noisy graphs. To further reduce sensitivity and improve utility, it presents a double clipping technique that privately bounds both user degree and the number of noisy triangles, yielding substantial gains in practical scenarios. The experimental results on large real graphs (Google+ and IMDB) and synthetic BA graphs show dramatic download-time reductions (e.g., from hours to seconds at 20 Mbps) while maintaining small relative error, and demonstrate the effectiveness of ARROneNS$_{\triangle}$ with double clipping for large or dense graphs. The work also extends the privacy analysis to an $(\varepsilon,\delta)$-DP setting with a triangle-excess bound, and discusses implications for clustering coefficient estimation and practical deployments.
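The combination of edge randomized response and edge sampling mentioned above can be illustrated with a minimal sketch. It is not the paper's exact algorithm: the function name `asymmetric_rr`, the list-of-bits input, and the order of the two steps are assumptions made for illustration. The keep probability $p_1 = \frac{e^{\varepsilon}}{e^{\varepsilon}+1}$ and the sampling probability $p_2 \in [0,1]$ follow the caption of Figure 2.

```python
import math
import random


def asymmetric_rr(bits, epsilon, p2, rng):
    """Apply randomized response to each bit, then sample the resulting 1s.

    Hypothetical helper for illustration only.
    bits    : list of 0/1 neighbor-list entries held by one user
    epsilon : privacy budget of the randomized response step
    p2      : probability of keeping a reported 1 (edge sampling)
    rng     : a random.Random instance
    """
    # Probability of reporting the true bit (Warner's randomized response).
    p1 = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    noisy = []
    for b in bits:
        # Step 1: keep the true bit with probability p1, otherwise flip it.
        reported = b if rng.random() < p1 else 1 - b
        # Step 2: keep a reported 1 only with probability p2. This does not
        # look at the true bit, so it is post-processing of step 1.
        if reported == 1 and rng.random() >= p2:
            reported = 0
        noisy.append(reported)
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1, 0, 0, 1, 0, 0, 0, 1]
    print(asymmetric_rr(true_bits, epsilon=1.0, p2=0.5, rng=rng))
```

Under these assumptions a true edge is reported as 1 with probability $p_1 p_2$ and a non-edge with probability $(1-p_1) p_2$, so a smaller $p_2$ gives a sparser noisy graph and less data for each user to download.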

Abstract

Triangle counting in networks under LDP (Local Differential Privacy) is a fundamental task for analyzing connection patterns or calculating a clustering coefficient while strongly protecting sensitive friendships from a central server. In particular, a recent study proposes an algorithm for this task that uses two rounds of interaction between users and the server to significantly reduce estimation error. However, this algorithm suffers from a prohibitively high communication cost because each user needs to download a large noisy graph. In this work, we propose triangle counting algorithms under LDP with both a small estimation error and a small communication cost. We first propose two-round algorithms that combine edge sampling with a careful selection of the edges each user downloads, so that the estimation error is small. Then we propose a double clipping technique, which clips the number of edges and then the number of noisy triangles, to significantly reduce the sensitivity of each user's query. Through comprehensive evaluation, we show that our algorithms dramatically reduce the communication cost of the existing algorithm, e.g., from 6 hours to 8 seconds or less at a 20 Mbps download rate, while keeping a small estimation error.
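To get a feel for why downloading a dense noisy graph is so expensive, the following back-of-the-envelope sketch estimates the download time. It assumes, purely for illustration, that the noisy graph is sent uncompressed at one bit per user pair and that the graph has $n=896308$ users (the IMDB size quoted in the Figure 7 caption). The function name `dense_download_seconds` is hypothetical, and the source does not state which dataset the 6-hour figure refers to.

```python
def dense_download_seconds(n_users, mbps):
    """Seconds needed to download one bit per unordered user pair.

    Assumption: the noisy graph is sent uncompressed, one bit per pair.
    """
    bits = n_users * (n_users - 1) // 2  # number of unordered user pairs
    return bits / (mbps * 1_000_000)     # Mbps -> bits per second


if __name__ == "__main__":
    n_imdb = 896_308  # number of users in IMDB, per the Figure 7 caption
    hours = dense_download_seconds(n_imdb, mbps=20) / 3600
    print(f"{hours:.1f} hours")
```

Under these assumptions the result is between five and six hours, the same order of magnitude as the 6-hour figure quoted in the abstract.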

Paper Structure

This paper contains 34 sections, 9 theorems, 47 equations, 17 figures, 3 tables, 2 algorithms.

Key Result

Proposition 1

If each of the local randomizers $\mathcal{R}_1, \ldots, \mathcal{R}_n$ provides $\varepsilon$-edge LDP, then $(\mathcal{R}_1, \ldots, \mathcal{R}_n)$ provides $2\varepsilon$-relationship DP. Additionally, if each $\mathcal{R}_i$ uses only bits $a_{i,1}, \ldots, a_{i,i-1}$

Figures (17)

  • Figure 1: Triangles, $2$-stars, and clustering coefficient.
  • Figure 2: Overview of our communication-efficient triangle counting algorithms ($p_1 =\frac{e^{\varepsilon}}{e^{\varepsilon}+1}$, $p_2 \in [0,1]$).
  • Figure 3: Noisy edges to download in our three algorithms.
  • Figure 5: Noisy triangles involving edge $(v_i,v_j)$ counted by user $v_i$ ($j<k,l,m<i$).
  • Figure 7: Relative error of our three algorithms with double clipping ("DC") when $\varepsilon=1$ or $2$ and $\mu^*=10^{-3}$ ($n=107614$ in Gplus, $n=896308$ in IMDB).
  • ...and 12 more figures
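The quantities named in Figure 1 (triangles, $2$-stars, and the clustering coefficient) can be computed on a small non-private graph as a sanity check. The sketch below uses the standard global clustering coefficient, $3 \times \text{triangles} / \text{2-stars}$, which is assumed here to be the definition the paper intends. The helper names and the adjacency-set representation are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def count_triangles(adj):
    """Count unordered node triples that are pairwise connected."""
    return sum(
        1
        for u, v, w in combinations(sorted(adj), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )


def count_2stars(adj):
    """Count 2-stars: a center node plus an unordered pair of its neighbors."""
    return sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in adj.values())


def clustering_coefficient(adj):
    """Global clustering coefficient: 3 * triangles / 2-stars."""
    stars = count_2stars(adj)
    return 3 * count_triangles(adj) / stars if stars else 0.0


if __name__ == "__main__":
    # A triangle on nodes 0, 1, 2 with a pendant node 3 attached to node 0.
    adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
    print(count_triangles(adj), count_2stars(adj), clustering_coefficient(adj))
```

The example graph has 1 triangle and 5 two-stars, giving a clustering coefficient of 0.6. Under LDP, the exact counts would be replaced by private estimates.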

Theorems & Definitions (17)

  • Definition 1: $\varepsilon$-edge LDP [Qin_CCS17]
  • Definition 2: $\varepsilon$-relationship DP [Imola_USENIX21]
  • Proposition 1: Edge LDP and relationship DP [Imola_USENIX21]
  • Proposition 2: Sequential composition of edge LDP
  • Definition 3
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Lemma 1
  • ...and 7 more