Actual Knowledge Gain as Privacy Loss in Local Privacy Accounting

Mingen Pan

TL;DR

This work reframes privacy accounting under Local Differential Privacy by introducing realized privacy loss $L(y)$, the maximum possible knowledge gain from an output, and proving its key relationship with the LDP guarantee $L(y)\le e^{\epsilon}$. It establishes an equivalence between LDP and a global learning limit on object-specific statements, then develops the Bayesian Privacy Filter and Privacy Odometer to enable fully adaptive composition while keeping realized loss within the DP bound. To handle continuous-valued queries, it introduces a Difference of Convex (DC) programming framework and a branch-and-bound method to compute realized privacy loss, enabling practical deployment via discretization and translation of regression queries. Empirical results show that Bayesian composition substantially outperforms basic composition for linear and logistic regressions (up to 2–5× efficiency) and provides meaningful privacy budgeting insights in real-world healthcare data. Overall, the approach allows near-optimal utilization of privacy budgets in adaptive querying scenarios and tight upper bounds on leakage via realized privacy loss, bridging LDP with QIF concepts and enabling scalable, privacy-preserving analytics.

Abstract

This paper establishes the equivalence between Local Differential Privacy (LDP) and a global limit on learning any knowledge specific to a queried object. However, an output from an LDP query need not provide an amount of knowledge equal to the upper bound of the learning limit, so the LDP guarantee can overestimate the knowledge an analyst gains from some outputs. To address this issue, the least upper bound on the actual knowledge gain is derived and referred to as realized privacy loss. This measure is also shown to serve as an upper bound for the actual g-leakage in quantitative information flow. The gap between the LDP guarantee and realized privacy loss motivates more efficient privacy accounting for fully adaptive composition, where an adversary adaptively selects queries based on prior results. The Bayesian Privacy Filter is introduced to continuously accept queries until the realized privacy loss of the composed queries equals the LDP guarantee of the composition, enabling full utilization of an object's privacy budget. The realized privacy loss also functions as a privacy odometer for the composed queries, so that the remaining privacy budget accurately represents the capacity to accept new queries. Additionally, a branch-and-bound method is devised to compute the realized privacy loss when querying against continuous values. Experimental results indicate that the Bayesian Privacy Filter outperforms basic composition by a factor of one to four when composing linear and logistic regressions.

Paper Structure (43 sections, 11 theorems, 63 equations, 1 figure, 1 table, 6 algorithms)

Introduction
Motivation
Result
Related Work
Background
Belief and Bayesian Inference
Probabilistic Knowledge
Quantitative Information Flow (QIF)
Local Differential Privacy (LDP)
LDP of Continuous Value
Learning from LDP queries
Learning Limit
Equivalence between LDP and Learning Limit
What is Privacy Loss?
Relationship with QIF
...and 28 more sections

Key Result

Lemma 3.1

For any statement $f$ specific to $X$, prior belief $Q$, and output $y$ from a query $M(X)$, the inequality in Eq. eq:bound_of_Q_f holds. Hereinafter, if $\min_{x' \in \mathcal{X}} \Pr(M(x') = y) = 0$, the right-hand side of Eq. eq:bound_of_Q_f is assumed to be $+\infty$.
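As a minimal sketch of how an output can reveal less than the LDP guarantee allows, the Python below assumes (this is not stated in the text above) that the realized privacy loss of an output $y$ is the likelihood ratio $\max_x \Pr(M(x)=y) / \min_{x'} \Pr(M(x')=y)$, taken as $+\infty$ when the minimum is zero, and that $e^{\epsilon}$ is the largest such ratio over all outputs. The mechanism table `M` and all function names are hypothetical and chosen only for illustration.

```python
import math

def likelihood_ratio(likelihoods):
    """max / min of Pr(observation | x) over inputs x; +inf if the min is zero."""
    lo, hi = min(likelihoods), max(likelihoods)
    return math.inf if lo == 0 else hi / lo

def column(mechanism, y):
    """Pr(M(x) = y) for every input x, where mechanism[x][y] = Pr(M(x) = y)."""
    return [row[y] for row in mechanism]

def ldp_ratio(mechanism):
    """Worst-case ratio over all outputs (e^epsilon under the assumption above)."""
    n_outputs = len(mechanism[0])
    return max(likelihood_ratio(column(mechanism, y)) for y in range(n_outputs))

def composed_ratio(mechanisms, outputs):
    """Ratio for jointly observing independent queries answered on the same input."""
    n_inputs = len(mechanisms[0])
    joint = [
        math.prod(m[x][y] for m, y in zip(mechanisms, outputs))
        for x in range(n_inputs)
    ]
    return likelihood_ratio(joint)

# Hypothetical mechanism: 2 possible inputs (rows), 3 possible outputs (columns).
M = [
    [0.50, 0.30, 0.20],
    [0.25, 0.30, 0.45],
]

per_output = [likelihood_ratio(column(M, y)) for y in range(3)]
worst_case = ldp_ratio(M)

# Two queries with the same mechanism: basic composition multiplies worst cases,
# while the joint ratio depends on which outputs were actually observed.
basic_bound = worst_case ** 2
joint_opposed = composed_ratio([M, M], [0, 2])
joint_aligned = composed_ratio([M, M], [2, 2])
```

Under these assumptions, output 1 is equally likely for both inputs and so has ratio 1, while observing outputs 0 and then 2 gives a joint ratio well below the basic-composition bound because the two observations favor different inputs. This is the kind of gap that the abstract describes the Bayesian Privacy Filter as exploiting; the filter's actual acceptance rule is not reproduced here.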

Figures (1)

Figure 1: Results for Composition Experiments. (a) Realized privacy loss vs. number of accepted queries for 10 sampled linear regressions; (b) confidence interval (10th to 90th percentile, same for (d)) of the number of accepted queries vs. privacy budget for linear regressions; (c) realized privacy loss vs. number of accepted queries for 10 sampled logistic regressions; and (d) confidence interval of the number of accepted queries vs. privacy budget for logistic regressions.

Theorems & Definitions (14)

Example 1
Lemma 3.1
Theorem 3.1
Lemma 3.2
Theorem 3.2
Example 2
Theorem 3.3
Corollary 3.1
Theorem 4.1
Corollary 4.1
...and 4 more
