Enhanced Privacy Bound for Shuffle Model with Personalized Privacy

Yixuan Liu; Yuhan Liu; Li Xiong; Yujie Gu; Hong Chen

TL;DR

This work tightens privacy guarantees for the shuffle model under personalized local DP by introducing a hypothesis-testing-based method to compute the confounding probability $p$ and applying $f$-DP to bound the central privacy amplification. By precisely modeling clone-generation and leveraging the convexity of the resulting distributions, the authors derive a general, tighter bound for arbitrary DP mechanisms, with worst-case analysis incorporating heterogeneous budgets. Numerical results show the bound outperforms state-of-the-art results in both pure- and approximate-PLDP, demonstrating stronger amplification as the number of users grows and across diverse privacy settings. The approach provides a practical, general framework for quantifying privacy amplification in heterogeneous privacy regimes, with significant implications for real-world distributed data collection systems.

Abstract

The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol that introduces an intermediate trusted server between local users and a central data curator. It significantly amplifies the central DP guarantee by anonymizing and shuffling the locally randomized data. Yet deriving a tight privacy bound is challenging because of its complicated randomization protocol. While most existing work focuses on uniform local privacy settings, this work derives the central privacy bound for a more practical setting in which each user requires a personalized local privacy level. To bound the privacy after shuffling, we first need to capture the probability of each user generating clones of the neighboring data points. Second, we need to quantify the indistinguishability between the two distributions of the number of clones on neighboring datasets. Existing works either capture the probability inaccurately or underestimate the indistinguishability between neighboring datasets. Motivated by this, we develop a more precise analysis, which yields a general and tighter bound for arbitrary DP mechanisms. First, we derive the clone-generating probability by hypothesis testing from a randomizer-specific perspective, which leads to a more accurate characterization of the probability. Second, we analyze the indistinguishability in the context of $f$-DP, where the convexity of the distributions is leveraged to achieve a tighter privacy bound. Theoretical and numerical results demonstrate that our bound remarkably outperforms the existing results in the literature.
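The protocol described in the abstract can be illustrated with a minimal sketch. It assumes $k$-ary randomized response as the local randomizer, which is only one possible choice since the paper's bound is stated for arbitrary DP mechanisms. The function names `k_randomized_response` and `shuffle_reports` are hypothetical, and only the pure-LDP case ($\delta_i = 0$) is shown. Each user perturbs their value under their own budget $\epsilon_i$, and the shuffler then permutes the (report, budget) pairs so the analyzer cannot link a report to its sender.

```python
import math
import random


def k_randomized_response(x, k, eps, rng):
    """Illustrative eps-LDP randomizer over the domain {0, ..., k-1}.

    Keeps the true value with probability e^eps / (e^eps + k - 1);
    otherwise reports one of the other k - 1 values uniformly at random.
    """
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return x
    other = rng.randrange(k - 1)          # index among the k - 1 other values
    return other if other < x else other + 1


def shuffle_reports(data, budgets, k, seed=0):
    """Randomize each x_i under its own budget eps_i, then shuffle.

    Returns a list of (perturbed value, eps_i) pairs in random order,
    so the position of a pair no longer identifies the user.
    """
    rng = random.Random(seed)
    reports = [
        (k_randomized_response(x, k, eps, rng), eps)
        for x, eps in zip(data, budgets)
    ]
    rng.shuffle(reports)                  # the shuffler's uniform permutation
    return reports


if __name__ == "__main__":
    data = [0, 1, 2, 2, 1, 0]
    budgets = [0.5, 1.0, 2.0, 0.5, 1.0, 2.0]   # personalized eps_i (assumed values)
    print(shuffle_reports(data, budgets, k=3))
```

The sketch only simulates the data flow. It does not compute the amplified central privacy bound, which is the subject of the paper's analysis.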

Paper Structure (13 sections, 2 theorems, 12 equations, 4 figures, 2 tables)

Introduction
Preliminaries
Central and Local Differential Privacy
Shuffle-based Privacy
Privacy Analysis
Confounding Effect $p$
Quantifying $p$ with Hypothesis Testing
Hypothesis Testing on Neighboring Data Point $x_1^b$
Hypothesis Testing on Rest Data Points $x_i$
Privacy Amplification with $f$-DP
Experiment Results
Conclusion
Acknowledgement

Key Result

Theorem 1

The trade-off function of the shuffling process is defined as $f_s(\alpha(t))$ for $t\geq 0$, where each $\alpha(t) = \sum_{i=0}^{n-1} w_i^0 F_i(i-\frac{i+1}{t+1}) \in [0,1]$. The function $f_s$ at $\alpha(t)$ is where $F_i$ is the abbreviation of $F_i(i+1-\frac{i+1}{t+1})$.
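For readers unfamiliar with trade-off functions, the following sketch evaluates the standard $f$-DP trade-off function of a generic $(\epsilon, \delta)$-DP mechanism, $f_{\epsilon,\delta}(\alpha) = \max\{0,\ 1-\delta-e^{\epsilon}\alpha,\ e^{-\epsilon}(1-\delta-\alpha)\}$, which maps a type I error $\alpha$ to the smallest achievable type II error. This is background from the general $f$-DP framework, not the paper's $f_s$. The function name `tradeoff_eps_delta` is hypothetical.

```python
import math


def tradeoff_eps_delta(alpha, eps, delta=0.0):
    """Trade-off function of (eps, delta)-DP at type I error alpha in [0, 1].

    Returns the lower bound on the type II error of any test that
    distinguishes two neighboring datasets with type I error alpha.
    """
    return max(
        0.0,
        1.0 - delta - math.exp(eps) * alpha,
        math.exp(-eps) * (1.0 - delta - alpha),
    )


if __name__ == "__main__":
    for a in (0.0, 0.1, 0.25, 0.5, 1.0):
        print(a, tradeoff_eps_delta(a, eps=1.0, delta=1e-5))
```

A curve closer to $1 - \alpha$ means the two hypotheses are harder to tell apart, so a tighter amplification bound corresponds to a higher trade-off curve.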

Figures (4)

Figure 1: Procedure of the shuffle model with personalized privacy. Each user's data $x_i$ is randomized locally. The privacy parameters $(\epsilon_i, \delta_i)$ and the perturbed reports $\tilde{x}_i$ are shuffled. The analyzer aggregates $\tilde{x}_i$ for further statistics or model training.
Figure 2: Green area represents $\Pr[R(x_i) \in U_0]$, where the output of $R(x_i)$ is mistaken as coming from $x_1^0$; yellow area represents $\Pr[R(x_i) \in U_1]$, where the output of $R(x_i)$ is mistaken as coming from $x_1^1$.
Figure 3: Confounding effect $p$ under personalized privacy.
Figure 4: Privacy bounds for various numbers of data points and privacy parameters, for pure- and approximate-PLDP.

Theorems & Definitions (4)

Definition 1: Differential Privacy
Definition 2: Local Differential Privacy
Theorem 1: Trade-off function
Theorem 2: Enhanced Privacy Bound
