Renyi Differential Privacy in the Shuffle Model: Enhanced Amplification Bounds

E Chen, Yang Cao, Yifei Ge

TL;DR

This work advances Renyi Differential Privacy in the shuffle model by providing the first asymptotically optimal RDP analysis without restricting the local privacy budget $\epsilon_0$, and by linking the shuffle process to a multinomial distance whose exact and asymptotic bounds yield tighter privacy guarantees. The authors introduce a hypothesis-testing framework to derive exact and asymptotic RDP bounds and show the shuffled mechanism closely matches low-loss GDP/regression bounds in the limit. They also present a DP-SGD algorithm built on these RDP insights, with experimental results on MNIST demonstrating improved privacy-utility performance over existing shuffle-based approaches at the same privacy level. Overall, the paper tightens the theoretical understanding of privacy amplification via shuffling and provides practical, scalable guidance for privacy-preserving learning. The combination of a multinomial-based exact bound, an asymptotic normal-approximation bound, and a hypothesis-testing toolkit represents a cohesive advancement with direct impact on private distributed learning workflows.

Abstract

The shuffle model of Differential Privacy (DP) has gained significant attention in privacy-preserving data analysis due to its remarkable tradeoff between privacy and utility. It is characterized by adding a shuffling procedure after each user's locally differentially private perturbation, which leads to a privacy amplification effect, meaning that the privacy guarantee of a small level of noise, say $\epsilon_0$, can be enhanced to $O(\epsilon_0/\sqrt{n})$ (the smaller, the more private) after shuffling all $n$ users' perturbed data. Most studies of shuffle DP focus on proving tighter privacy amplification guarantees. However, current results assume that the local privacy budget $\epsilon_0$ lies within a limited range. In addition, there remains a gap between the tightest lower bound and the known upper bound on privacy amplification. In this work, we push the state of the art forward with the following contributions. First, we present the first asymptotically optimal analysis of Renyi Differential Privacy (RDP) in the shuffle model without constraints on $\epsilon_0$. Second, we introduce hypothesis testing for privacy amplification through shuffling, offering a distinct analysis technique and a tighter upper bound. Furthermore, we propose a DP-SGD algorithm based on RDP. Experiments demonstrate that our approach significantly outperforms existing methods at the same privacy level.
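
To make the pipeline in the abstract concrete, here is a minimal Python sketch of the shuffle model: each user perturbs a single bit locally, then a shuffler permutes the reports before they reach the analyst. Binary randomized response is used as the $\epsilon_0$-LDP randomizer purely for illustration; it is an assumption and not necessarily the randomizer used in the paper. The function names (`randomized_response`, `shuffle_model`, `estimate_mean`) are hypothetical.

```python
import math
import random


def randomized_response(bit: int, eps0: float, rng: random.Random) -> int:
    """eps0-LDP randomizer (assumed for illustration): keep the true bit
    with probability e^eps0 / (e^eps0 + 1), otherwise flip it."""
    p_truth = math.exp(eps0) / (math.exp(eps0) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def shuffle_model(bits, eps0: float, seed: int = 0):
    """Perturb every user's bit locally, then shuffle the reports so the
    analyst cannot link a report to the user who sent it."""
    rng = random.Random(seed)
    reports = [randomized_response(b, eps0, rng) for b in bits]
    rng.shuffle(reports)  # the shuffler: a uniformly random permutation
    return reports


def estimate_mean(reports, eps0: float) -> float:
    """Debias the observed frequency of ones to estimate the true mean.
    E[report] = (1 - p) + mean * (2p - 1), where p is the keep probability."""
    p = math.exp(eps0) / (math.exp(eps0) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    true_bits = [1] * 3000 + [0] * 7000  # hypothetical data, true mean 0.3
    out = shuffle_model(true_bits, eps0=2.0, seed=1)
    print(f"estimated mean: {estimate_mean(out, 2.0):.3f}")
```

The sketch only shows the data flow. It does not compute the amplified privacy guarantee, which is the subject of the paper's analysis.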

Paper Structure

This paper contains 12 sections, 15 theorems, 44 equations, 4 figures, 2 tables, and 1 algorithm.

Key Result

Proposition 1

(Feldman et al. feldman2022hiding) For a domain $\mathcal{D}$, let $\mathcal{A}_R$ be the $\epsilon_0$-LDP adaptive process and $\mathcal{A}_{R,S}$ be the related shuffled $\epsilon_0$-LDP adaptive process. Let $X_0 = (x^0_1,x_2,\ldots,x_n)$ and $X_1 = (x^1_1,x_2,\ldots,x_n)$ be two neighbouring […] Then there exists a randomized postprocessing algorithm $f$ such that $\mathcal{A}_s(X_0)$ is distr[…]

Figures (4)

  • Figure 1: The shuffle model with $\epsilon_0$-LDP users
  • Figure 2: RDP as a function of $\epsilon_0$ for $\lambda=4$ and $n=10^4$
  • Figure 3: RDP as a function of $\lambda$ for $\epsilon_0=2$ and $n=10^4$
  • Figure 4: Comparison of train accuracy at the same privacy level of $(\lambda, 0.008\lambda)$-RDP. Non-Private refers to the standard SGD without adding noise.
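
For readers more used to $(\epsilon, \delta)$-DP, the privacy level quoted in Figure 4 can be translated with the standard RDP-to-DP conversion: a $(\lambda, \epsilon_\lambda)$-RDP guarantee implies $(\epsilon_\lambda + \log(1/\delta)/(\lambda - 1), \delta)$-DP. The Python sketch below assumes the $(\lambda, 0.008\lambda)$-RDP guarantee holds for every order $\lambda > 1$ in a search grid. The value $\delta = 10^{-5}$, the grid of orders, and the function names are illustrative assumptions that do not come from the paper.

```python
import math


def rdp_to_dp(rdp_eps: float, lam: float, delta: float) -> float:
    """Standard conversion: (lam, rdp_eps)-RDP implies (eps, delta)-DP with
    eps = rdp_eps + log(1/delta) / (lam - 1)."""
    return rdp_eps + math.log(1.0 / delta) / (lam - 1.0)


def best_epsilon(rho: float, delta: float, orders):
    """Assuming (lam, rho * lam)-RDP holds for every order in `orders`,
    return the smallest resulting epsilon and the order that attains it."""
    return min((rdp_to_dp(rho * lam, lam, delta), lam) for lam in orders)


if __name__ == "__main__":
    # rho = 0.008 is taken from the Figure 4 caption; delta is an assumption.
    eps, lam = best_epsilon(rho=0.008, delta=1e-5, orders=range(2, 200))
    print(f"epsilon = {eps:.3f} at order lambda = {lam}")
```

Under these assumptions the conversion gives an $\epsilon$ of about 0.615. Tighter conversions exist, so this should be read as a rough upper bound and not as the paper's reported figure.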

Theorems & Definitions (27)

  • Definition 1: Pure Differential Privacy
  • Definition 2
  • Definition 3: Rényi Differential Privacy
  • Proposition 1
  • Theorem 1
  • Corollary 1
  • Lemma 1
  • Theorem 2
  • Corollary 2
  • Proposition 2: Composition theorem of GDP (dong2022gaussian)
  • ...and 17 more