Actual Knowledge Gain as Privacy Loss in Local Privacy Accounting
Mingen Pan
TL;DR
This work reframes privacy accounting under Local Differential Privacy (LDP) by introducing the realized privacy loss $L(y)$, the maximum possible knowledge gain from an output $y$, and proving that it never exceeds the LDP guarantee: $L(y)\le e^{\epsilon}$. It establishes an equivalence between LDP and a global learning limit on object-specific statements, then develops the Bayesian Privacy Filter and a privacy odometer to enable fully adaptive composition while keeping the realized loss within the LDP bound. To handle continuous-valued queries, it introduces a Difference of Convex (DC) programming framework and a branch-and-bound method for computing the realized privacy loss, enabling practical deployment via discretization and translation of regression queries. Empirical results show that Bayesian composition outperforms basic composition by a factor of one to four for linear and logistic regressions and provides meaningful privacy budgeting insights on real-world healthcare data. Overall, the approach allows near-full utilization of the privacy budget in adaptive querying scenarios and gives tight upper bounds on leakage via the realized privacy loss, connecting LDP with quantitative information flow (QIF).
Abstract
This paper establishes the equivalence between Local Differential Privacy (LDP) and a global limit on learning any knowledge specific to a queried object. However, an output of an LDP query does not necessarily convey an amount of knowledge equal to this limit, so the LDP guarantee can overestimate the knowledge an analyst gains from some outputs. To address this issue, the least upper bound on the actual knowledge gain is derived and referred to as the realized privacy loss. This measure is also shown to be an upper bound on the actual g-leakage in quantitative information flow. The gap between the LDP guarantee and the realized privacy loss motivates more efficient privacy accounting for fully adaptive composition, where an adversary adaptively selects queries based on prior results. The Bayesian Privacy Filter is introduced to keep accepting queries until the realized privacy loss of the composed queries equals the LDP guarantee of the composition, enabling full utilization of an object's privacy budget. The realized privacy loss also functions as a privacy odometer for the composed queries, so that the remaining privacy budget accurately represents the capacity to accept new queries. Additionally, a branch-and-bound method is devised to compute the realized privacy loss when querying against continuous values. Experimental results indicate that the Bayesian Privacy Filter outperforms basic composition by a factor of one to four when composing linear and logistic regressions.
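
The gap between the LDP guarantee and the realized privacy loss can be illustrated with a small sketch. It rests on assumptions not stated above: the realized privacy loss of the observed outputs is taken to be the largest ratio of their likelihoods across inputs (consistent with $L(y)\le e^{\epsilon}$), the three-output channel `CHANNEL` is invented for illustration, and `filter_accepts` is a simplified acceptance rule rather than the paper's Bayesian Privacy Filter.

```python
from fractions import Fraction

# Hypothetical channel (an assumption): CHANNEL[x][y] = P(y | x) for a binary secret x.
CHANNEL = {
    0: {"0": Fraction(3, 5), "1": Fraction(1, 5), "?": Fraction(1, 5)},
    1: {"0": Fraction(1, 5), "1": Fraction(3, 5), "?": Fraction(1, 5)},
}
OUTPUTS = ("0", "1", "?")


def realized_loss(likelihoods):
    """Largest likelihood ratio between any two inputs for what was observed."""
    return max(likelihoods.values()) / min(likelihoods.values())


def ldp_bound(channel):
    """Worst-case realized loss over all outputs, i.e. e^epsilon of the channel."""
    return max(realized_loss({x: channel[x][y] for x in channel}) for y in OUTPUTS)


def update(likelihoods, channel, y):
    """Multiply in P(y | x) for every input x after observing output y."""
    return {x: likelihoods[x] * channel[x][y] for x in channel}


def filter_accepts(likelihoods, channel, budget):
    """Simplified rule: accept only if no possible next output exceeds the budget."""
    return all(
        realized_loss(update(likelihoods, channel, y)) <= budget for y in OUTPUTS
    )


def run(outputs, budget):
    """Replay observed outputs; return (queries accepted, realized loss so far)."""
    likelihoods = {x: Fraction(1) for x in CHANNEL}
    accepted = 0
    for y in outputs:
        if not filter_accepts(likelihoods, CHANNEL, budget):
            break
        likelihoods = update(likelihoods, CHANNEL, y)
        accepted += 1
    return accepted, realized_loss(likelihoods)
```

In this toy setting each query has $e^{\epsilon}=3$, so a budget of 9 admits only two queries under basic composition. Because the output `"?"` reveals nothing and opposing outputs cancel, `run(["?", "?", "0", "1", "0"], 9)` accepts all five queries, while `run(["0", "0", "0"], 9)` stops after two because a third `"0"` could push the loss past the budget.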
