Revealing the True Cost of Locally Differentially Private Protocols: An Auditing Perspective

Héber H. Arcolezi; Sébastien Gambs

TL;DR

This work addresses the gap between theoretical guarantees and practical privacy in Local Differential Privacy by introducing LDP-Auditor, a framework that empirically estimates the local privacy loss $\epsilon_{emp}$ via distinguishability-style attacks on LDP frequency-estimation protocols. It covers pure and approximate LDP protocols, and extends auditing to longitudinal and multidimensional data settings with novel attacks $\mathcal{A}^L$ and $\mathcal{A}^{\text{RS+FD}}$, respectively. The authors conduct extensive experiments across nine protocols, reveal gaps between $\epsilon_{emp}$ and the theoretical $\epsilon$, detect a bug in a Python LDP package, and provide open-source tooling for practitioners. This framework supports more informed parameter selection and highlights directions for designing tighter, more robust LDP mechanisms in real-world deployments.
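For context, the audited quantity can be stated compactly. The display below gives the standard definition of ($\epsilon, \delta$)-LDP and the hypothesis-testing bound that distinguishability-based audits typically rely on. It is textbook material rather than a statement quoted from the paper, and TPR and FPR (the attack's true- and false-positive rates when guessing $v_1$) are notation assumed here for illustration.

$$
\Pr[\mathcal{M}(v_1) \in S] \le e^{\epsilon} \Pr[\mathcal{M}(v_2) \in S] + \delta \quad \text{for all inputs } v_1, v_2 \text{ and all output sets } S,
$$

$$
\text{TPR} \le e^{\epsilon}\,\text{FPR} + \delta \;\Longrightarrow\; \epsilon \ge \ln\frac{\text{TPR} - \delta}{\text{FPR}} \quad (\text{TPR} > \delta,\ \text{FPR} > 0).
$$

Under this reading, any attack that achieves a given (TPR, FPR) pair certifies a lower bound on the privacy loss, which is the kind of quantity an empirical estimate such as $\epsilon_{emp}$ reports.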

Abstract

While the existing literature on Differential Privacy (DP) auditing predominantly focuses on the centralized model (e.g., in auditing the DP-SGD algorithm), we advocate for extending this approach to audit Local DP (LDP). To achieve this, we introduce the LDP-Auditor framework for empirically estimating the privacy loss of locally differentially private mechanisms. This approach leverages recent advances in designing privacy attacks against LDP frequency estimation protocols. More precisely, through the analysis of numerous state-of-the-art LDP protocols, we extensively explore the factors influencing the privacy audit, such as the impact of different encoding and perturbation functions. Additionally, we investigate the influence of the domain size and the theoretical privacy loss parameters $\epsilon$ and $\delta$ on local privacy estimation. In-depth case studies are also conducted to explore specific aspects of LDP auditing, including distinguishability attacks on LDP protocols for longitudinal studies and multidimensional data. Finally, we present a notable achievement of our LDP-Auditor framework, which is the discovery of a bug in a state-of-the-art LDP Python package. Overall, our LDP-Auditor framework as well as our study offer valuable insights into the sources of randomness and information loss in LDP protocols. These contributions collectively provide a realistic understanding of the local privacy loss, which can help practitioners in selecting the LDP mechanism and privacy parameters that best align with their specific requirements. We open-sourced LDP-Auditor at https://github.com/hharcolezi/ldp-audit.
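To make the auditing idea concrete, here is a minimal sketch of a distinguishability audit on Generalized Randomized Response (GRR), using only the Python standard library. It follows the Clopper-Pearson recipe commonly used in the DP auditing literature and is not the authors' algorithm or the API of the ldp-audit package. The function names (`grr`, `binom_cdf`, `cp_lower`, `cp_upper`, `audit_grr`), the choice of $v_1 = 0$ and $v_2 = 1$, and the "guess $v_1$ if the report equals $v_1$" attack are all assumptions made for illustration.

```python
import math
import random


def grr(value, k, eps, rng):
    """Generalized Randomized Response over the domain {0, ..., k-1}."""
    p = math.exp(eps) / (math.exp(eps) + k - 1)  # probability of reporting the truth
    if rng.random() < p:
        return value
    # Otherwise report one of the k - 1 other values uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def binom_cdf(x, n, p):
    """P[X <= x] for X ~ Binomial(n, p), with each term computed in log-space."""
    if x < 0:
        return 0.0
    if x >= n:
        return 1.0
    total = 0.0
    for i in range(x + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def cp_lower(successes, n, alpha):
    """One-sided Clopper-Pearson lower bound on a binomial proportion (bisection)."""
    if successes == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        # P[X >= successes] grows with p; move up while the tail is below alpha.
        if 1.0 - binom_cdf(successes - 1, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo


def cp_upper(successes, n, alpha):
    """One-sided Clopper-Pearson upper bound on a binomial proportion (bisection)."""
    if successes == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        # P[X <= successes] shrinks with p; move up while it is above alpha.
        if binom_cdf(successes, n, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return hi


def audit_grr(eps, k=10, trials=2000, alpha=0.05, delta=0.0, seed=0):
    """Estimate a lower bound eps_emp for GRR run with theoretical budget eps."""
    rng = random.Random(seed)
    v1, v2 = 0, 1  # two distinct inputs the attacker tries to tell apart
    # Assumed attack: guess "v1" exactly when the report equals v1.
    tp = sum(grr(v1, k, eps, rng) == v1 for _ in range(trials))  # true positives
    fp = sum(grr(v2, k, eps, rng) == v1 for _ in range(trials))  # false positives
    tpr_lo = cp_lower(tp, trials, alpha / 2)
    fpr_hi = cp_upper(fp, trials, alpha / 2)
    if tpr_lo - delta <= 0.0:
        return 0.0
    return max(0.0, math.log((tpr_lo - delta) / fpr_hi))


if __name__ == "__main__":
    print(audit_grr(2.0))
```

Because the bounds are conservative, the returned estimate is expected to sit somewhat below the theoretical $\epsilon$ for a correctly implemented GRR. A larger number of trials narrows the confidence intervals and tightens the estimate.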

Paper Structure (29 sections, 1 theorem, 8 equations, 12 figures, 1 table, 4 algorithms)

Introduction
Our Contributions
Related Work
LDP Frequency Estimation Protocols
Pure $\epsilon$-LDP Protocols
Approximate (${\epsilon, \delta}$)-LDP Protocols
LDP Auditing
LDP-Auditor
LDP-Auditor for Longitudinal Studies
LDP-Auditor for Multidimensional Data
Experimental Evaluation
General Setup of Experiments
Main Auditing Results
Case Study #1: Approximate- vs. Pure-LDP
Case Study #2: Auditing the Privacy Loss of Local Hashing Encoding Without LDP
...and 14 more sections

Key Result

Theorem 1

Given black-box access to an LDP mechanism ${\mathcal{M}}$, and a distinguishability attack ${\mathcal{A}}$, for any two distinct values $v_1, v_2$, a number of trials $T$, and a statistical confidence $\alpha$, if LDP-Auditor in Algorithm alg:ldp_auditor_lb returns $\epsilon_{emp}$, then, with prob

Figures (12)

Figure 1: Comparison of estimated privacy loss $\epsilon_{emp}$ with theoretical upper bound $\epsilon=2$ for eight pure LDP frequency estimation protocols. The dashed red line corresponds to the certifiable upper bound. While GRR closely aligns with the theoretical bound, others exhibit empirical $\epsilon_{emp}$ within $\leq 2$x (e.g., SUE) or even $\leq 4$x (i.e., BLH) of the theoretical $\epsilon$ value.
Figure 2: Theoretical $\epsilon$ values (x-axis) versus estimated $\epsilon_{emp}$ values (y-axis) using our LDP-Auditor framework with $\delta=0$. We compare different domain sizes $k$ for eight state-of-the-art $\epsilon$-LDP frequency estimation protocols: GRR [kairouz2016discrete], SS [wang2016mutual, Min2018], SUE [rappor], OUE [tianhao2017], BLH [Bassily2015], OLH [tianhao2017], SHE [Dwork2006] and THE [tianhao2017].
Figure 3: Theoretical $\epsilon$ values (x-axis) versus estimated $\epsilon_{emp}$ values (y-axis) using our LDP-Auditor framework with $\delta=10^{-5}$. We compare different domain sizes $k$ for six state-of-the-art (${\epsilon, \delta}$)-LDP frequency estimation protocols: AGRR [Wang2021_approx_ldp], ASUE [Wang2021_approx_ldp], ABLH [Wang2021_approx_ldp], AOLH [Wang2021_approx_ldp], GM [dwork2014algorithmic] and AGM [balle18a]. For GM, we only audit for certifiable theoretical upper bounds $\epsilon \leq 1$.
Figure 4: Theoretical $\epsilon$ values (x-axis) versus estimated $\epsilon_{emp}$ values (y-axis) using our LDP-Auditor framework. We assess different privacy guarantees for six (${\epsilon, \delta}$)-LDP protocols across domain sizes $k \in \{25, 200\}$. The special case $\delta=0$ corresponds to pure $\epsilon$-LDP, which GM and AGM do not satisfy.
Figure 5: Estimated $\epsilon_{emp}$ (y-axis) versus hash domain $g$ (x-axis) using our LDP-Auditor framework comparing different domain sizes $k$ for LH encoding with no LDP randomization.
...and 7 more figures

Theorems & Definitions (1)

Theorem 1: Correctness of LDP-Auditor
