Differential Privacy with Multiple Selections

Ashish Goel, Zhihao Jiang, Aleksandra Korolova, Kamesh Munagala, Sahasrajit Sarmasarkar

TL;DR

This work introduces a multi-selection architecture for differential privacy, where a server returns a small set of $k$ candidate results and the user privately chooses the best match. It proves that, on a one-dimensional domain under $\epsilon$-geographic DP, an optimal user action is to add Laplace noise with scale $1/\epsilon$, and that the utility loss decreases as $O(1/(\epsilon k))$ when the disutility function is the identity. The authors develop a novel infinite-dimensional linear-programming framework, Differential-Integral Linear Programs (DILPs), along with a weak-duality theory and a dual-fitting technique to certify the optimality of Laplace noise and to derive server response strategies. They also extend the analysis to generalized GDP and discuss high-dimensional extensions with PCA, showing that Laplace noise retains optimality under broad conditions. Overall, the paper provides a tight privacy-utility trade-off for multi-selection DP and a rigorous optimization toolkit that could inform practical privacy-preserving retrieval and recommendation systems.

Abstract

We consider the setting where a user with sensitive features wishes to obtain a recommendation from a server in a differentially private fashion. We propose a "multi-selection" architecture in which the server can send back multiple recommendations and the user chooses the one that best matches their private features. When the user feature is one-dimensional (on an infinite line) and the accuracy measure is defined with respect to some increasing function $\mathfrak{h}(\cdot)$ of the distance on the line, we precisely characterize the optimal mechanism that satisfies differential privacy. The specification of the optimal mechanism includes both the distribution of the noise that the user adds to their private value and the algorithm the server uses to determine the set of results to send back as a response; in particular, we show that Laplace is an optimal noise distribution. We further show that this optimal mechanism results in an error that is inversely proportional to the number of results returned when $\mathfrak{h}(\cdot)$ is the identity function.
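
As a short sanity check on the privacy side of this claim, the following sketch shows why Laplace noise with scale $1/\epsilon$ is feasible under geographic differential privacy. It assumes the standard formulation of $\epsilon$-geographic DP, in which the densities of the signal $s$ under any two private values $u_1, u_2$ differ by a factor of at most $e^{\epsilon |u_1 - u_2|}$; the paper's own definition is not reproduced in this summary, and the symbol $\rho_\epsilon$ is introduced here only for illustration. The check establishes feasibility only, not optimality, which is what the paper's main theorem addresses.

```latex
% Density of the Laplace noise with scale 1/\epsilon
\rho_\epsilon(t) = \frac{\epsilon}{2}\, e^{-\epsilon |t|}

% Ratio of signal densities for two private values u_1, u_2 at the same signal s
\frac{\rho_\epsilon(s - u_1)}{\rho_\epsilon(s - u_2)}
  = e^{\epsilon \left( |s - u_2| - |s - u_1| \right)}
  \le e^{\epsilon |u_1 - u_2|}
% The inequality is the triangle inequality: |s - u_2| \le |s - u_1| + |u_1 - u_2|.
```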

Paper Structure

This paper contains 57 sections, 32 theorems, 120 equations, and 4 figures.

Key Result

Theorem 1.4

For $\epsilon$-geographic differential privacy, adding Laplace noise, that is, user $u$ sends a signal drawn from distribution $\mathcal{L}_{\epsilon}(u)$, is one of the optimal choices of $\mathcal{P}^{(\epsilon)}_{Z}$ for users. Further, when $\mathfrak{h}(t)=t$, we have $f^{\mathds{1}(.)}(\epsilo

Figures (4)

  • Figure 1: Overall architecture for multi-selection.
  • Figure 2: Optimal mechanisms in the geographic differential privacy setting when $k=5$ and $\epsilon\in\{0.3,0.5,1.0\}$. Suppose the user has a private value $u$. Then the user sends a signal $s$ drawn from distribution $\mathcal{L}_{\epsilon}(u)$ to the server, meaning the user sends $s=u+x$ where $x$ is drawn from the density function $\rho(t)$ in this figure. Suppose the server receives $s$. Then the server responds with $\{s+a_1,\dots,s+a_5\}$, where the values of $a_1,a_2,\dots,a_5$ are the $t$-axis values of the dots on the density functions.
  • Figure 3: Solutions for Differential Equation \ref{eqn:diff_eqn_nu} for $\mathbf{v} = [-\log 4;\ 0;\ \log 4]^T$
  • Figure 4: Geographic differential privacy setting when users and results are located on a unit ring, for $k=2$ and $\epsilon\in \{3/8,1\}$, showing the stark difference between Laplace noise and the optimal noise. Suppose the user has a private value $u$. Then the user sends $u+x$ to the server, where $x$ is drawn from a noise distribution with density $\rho(t)$, depicted here for both Laplace noise and the optimal noise. Suppose the server receives $s$. Then the server's optimal response is $s+a_1$ and $s+a_2$, where the values of $a_1,a_2$ are the $t$-axis values of dots on the density functions, again assuming both Laplace noise and the optimal noise. Laplace is not optimal when $\epsilon=3/8$, while Laplace is optimal when $\epsilon=1$.
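
The pipeline described in the Figure 2 caption (the user sends $s=u+x$, the server responds with $\{s+a_1,\dots,s+a_k\}$, the user keeps the closest result) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's construction: the function names are hypothetical, and `quantile_offsets` is a placeholder rule that puts the offsets at evenly spaced quantiles of the Laplace noise. The optimal offsets $a_i$ shown in the figure are not reproduced in this summary and are not claimed to match this rule. The disutility is taken to be the identity, $\mathfrak{h}(t)=t$.

```python
import numpy as np


def user_signal(u, eps, rng):
    """User side: perturb the private value u with Laplace noise of scale 1/eps."""
    return u + rng.laplace(loc=0.0, scale=1.0 / eps)


def server_response(s, offsets):
    """Server side: return the candidates s + a_i around the received signal s."""
    return s + np.asarray(offsets, dtype=float)


def user_select(u, candidates):
    """User side: locally pick the candidate closest to the true value u."""
    candidates = np.asarray(candidates, dtype=float)
    return candidates[np.argmin(np.abs(candidates - u))]


def quantile_offsets(k, eps):
    """Hypothetical placeholder offsets (NOT the paper's optimal a_i):
    evenly spaced quantiles of the Laplace(0, 1/eps) distribution."""
    q = (np.arange(1, k + 1) - 0.5) / k
    # Inverse CDF of a zero-mean Laplace distribution with scale 1/eps.
    return -np.sign(q - 0.5) * np.log(1.0 - 2.0 * np.abs(q - 0.5)) / eps


def mean_disutility(eps, offsets, trials=20000, seed=0):
    """Monte Carlo estimate of E|u - chosen result| for h(t) = t.

    The mechanism is translation invariant, so u = 0 is used without loss
    of generality in this sketch.
    """
    rng = np.random.default_rng(seed)
    offsets = np.asarray(offsets, dtype=float)
    u = 0.0
    s = u + rng.laplace(loc=0.0, scale=1.0 / eps, size=trials)
    candidates = s[:, None] + offsets[None, :]
    return np.abs(candidates - u).min(axis=1).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    eps, k, u = 1.0, 5, 3.7
    s = user_signal(u, eps, rng)
    results = server_response(s, quantile_offsets(k, eps))
    print("signal:", s)
    print("candidates:", results)
    print("chosen:", user_select(u, results))
    for kk in (1, 2, 5, 10):
        print(kk, mean_disutility(eps, quantile_offsets(kk, eps)))
```

Only the noisy signal $s$ leaves the user's device; the final selection among the returned candidates happens locally, which is why returning more candidates can reduce the error without weakening the privacy guarantee on $s$.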

Theorems & Definitions (70)

  • Definition 1.1: adapted from 6686179, koufogiannis2015optimality
  • Definition 1.2: adapted from koufogiannis2015optimality
  • Definition 1.3
  • Example 1
  • Theorem 1.4: corresponds to Theorem \ref{thm:optimality-laplace-sim} and Theorem \ref{cor:geofinal}
  • Definition 2.1
  • Example 2
  • Theorem 2.2: detailed proof in Appendix \ref{sec-appendix:simproof}
  • Proof: Proof Sketch
  • Definition 2.3
  • ...and 60 more