Learning with Locally Private Examples by Inverse Weierstrass Private Stochastic Gradient Descent

Jean Dufraiche; Paul Mangold; Michaël Perrot; Marc Tommasi

TL;DR

This work addresses the bias that arises when learning from data released once under noninteractive Local Differential Privacy (LDP). It recasts privacy noise as functional transforms (the Gaussian mechanism as a Weierstrass transform, binary Randomized Response as a Bernoulli transform) and derives their inverses to obtain unbiased loss and gradient estimators. The authors introduce Inverse Weierstrass Private SGD (IWP-SGD), prove unbiasedness and convergence at an $O(1/n)$ rate, and show how the method specializes to generalized linear models. Empirical results on synthetic and real binary classification tasks show that IWP-SGD removes the LDP bias present in standard SGD on noisy data, at the cost of higher variance. Overall, the paper offers a principled, task-agnostic framework for debiasing private data, so that privatized data can be reused across downstream tasks.
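As an illustration of the Bernoulli-transform idea, the sketch below inverts binary Randomized Response for labels in $\{-1,+1\}$. It is a minimal example under stated assumptions, not the paper's implementation: the function names are hypothetical, the flip probability $1/(1+e^{\epsilon})$ is the standard choice for $\epsilon$-LDP Randomized Response, and the correction used is the classical unbiased estimator for symmetric label noise.

```python
import math


def rr_flip_probability(epsilon):
    """Flip probability of standard binary Randomized Response under epsilon-LDP."""
    return 1.0 / (1.0 + math.exp(epsilon))


def debias_randomized_response(g, flip_prob):
    """Return g_tilde with E[g_tilde(noisy_y)] == g(y) for y in {-1, +1}.

    Hypothetical helper: inverts the map g -> (1 - p) g(y) + p g(-y).
    """
    if not 0.0 <= flip_prob < 0.5:
        raise ValueError("flip_prob must lie in [0, 0.5)")

    def g_tilde(noisy_y):
        num = (1.0 - flip_prob) * g(noisy_y) - flip_prob * g(-noisy_y)
        return num / (1.0 - 2.0 * flip_prob)

    return g_tilde


def expectation_under_flip(h, y, flip_prob):
    """Exact expectation of h(noisy_y) when y is flipped with probability flip_prob."""
    return (1.0 - flip_prob) * h(y) + flip_prob * h(-y)
```

Evaluating `expectation_under_flip` on the corrected function recovers `g(y)` exactly, whereas the uncorrected function is biased toward a mixture of `g(y)` and `g(-y)`. The division by `1 - 2 * flip_prob` is also where the variance penalty comes from: it grows as the flip probability approaches 1/2.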

Abstract

Releasing data once and for all under noninteractive Local Differential Privacy (LDP) enables complete data reusability, but the resulting noise may create bias in subsequent analyses. In this work, we leverage the Weierstrass transform to characterize this bias in binary classification. We prove that inverting this transform leads to a bias-correction method to compute unbiased estimates of nonlinear functions on examples released under LDP. We then build a novel stochastic gradient descent algorithm called Inverse Weierstrass Private SGD (IWP-SGD). It converges to the true population risk minimizer at a rate of $\mathcal{O}(1/n)$, with $n$ the number of examples. We empirically validate IWP-SGD on binary classification tasks using synthetic and real-world datasets.
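The bias-correction idea can be checked numerically on a case where the inverse transform has a closed form. The sketch below assumes the smoothing operator $f \mapsto {\mathbb E}[f(x+\sigma Z)]$ with $Z\sim\mathcal{N}(0,1)$ (the paper's exact normalization of the generalized Weierstrass transform may differ) and uses the one-dimensional function $f(x)=e^{ax}$, for which ${\mathbb E}[e^{a(x+\sigma Z)}]=e^{ax}e^{a^2\sigma^2/2}$. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gaussian_smooth(f, x, sigma, degree=40):
    """Approximate E[f(x + sigma * Z)], Z ~ N(0, 1), by Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(degree)  # weight function exp(-t^2 / 2)
    return float(np.sum(weights * f(x + sigma * nodes)) / np.sqrt(2.0 * np.pi))


def exp_fn(a):
    """The function x -> exp(a * x)."""
    return lambda x: np.exp(a * x)


def debiased_exp_fn(a, sigma):
    """Closed-form inverse of the Gaussian smoothing applied to exp(a * x).

    Evaluated on a noisy input x + sigma * Z, it is unbiased for exp(a * x).
    """
    return lambda x: np.exp(a * x - 0.5 * (a * sigma) ** 2)


a, sigma, x = -0.8, 1.5, 0.3
naive = gaussian_smooth(exp_fn(a), x, sigma)                     # biased upward
corrected = gaussian_smooth(debiased_exp_fn(a, sigma), x, sigma)  # matches exp(a * x)
```

Here `naive` overshoots `exp(a * x)` by the factor `exp(0.5 * (a * sigma) ** 2)`, while `corrected` agrees with it up to quadrature error. For general nonlinear functions no such closed form is available, which is where a series expression of the inverse transform becomes useful.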

Paper Structure (55 sections, 23 theorems, 125 equations, 8 figures, 1 algorithm)

Introduction
Contributions.
Related Work
Interactive LDP Methods.
Noninteractive and Task-Specific LDP Methods.
Learning with Noisy Data.
Noninteractive and Task-Agnostic LDP.
Privacy Setting and Notations
Notations.
Privacy.
Privacy as a Transform
Weierstrass Transform: a Tool for Gaussian Noise
Bernoulli Transform: a Tool for Binary Label Noise
Combining Weierstrass and Bernoulli Transforms
Bias in Risk Minimization
...and 40 more sections

Key Result

Proposition 2.2

Assume a bounded subset ${\mathcal{X}}\subset{\mathbb R}^p$. By the Gaussian mechanism, the release of $x\in{\mathcal{X}}$ perturbed by Gaussian noise $\mathcal{N}(0,\sigma^2 I_p)$ with $\sigma^2=8\log(1.25/\delta)\left\|{{\mathcal{X}}}\right\|^2/\epsilon^2$ is $(\epsilon,\delta)$-LDP.
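A minimal sketch of this release step follows. It assumes that $\left\|{\mathcal{X}}\right\|$ denotes a bound on the Euclidean norm of any point in ${\mathcal{X}}$ (so the $\ell_2$ sensitivity is at most $2\left\|{\mathcal{X}}\right\|$) and that the noise is isotropic; the function names and the `radius` parameter are hypothetical.

```python
import math

import numpy as np


def gaussian_mechanism_sigma(epsilon, delta, radius):
    """Noise scale with sigma^2 = 8 * log(1.25 / delta) * radius^2 / epsilon^2.

    `radius` stands in for ||X||, assumed here to bound the norm of every example.
    """
    return math.sqrt(8.0 * math.log(1.25 / delta)) * radius / epsilon


def privatize(x, epsilon, delta, radius, rng):
    """Release x + N(0, sigma^2 I) once; x must satisfy ||x|| <= radius."""
    x = np.asarray(x, dtype=float)
    if np.linalg.norm(x) > radius + 1e-12:
        raise ValueError("example lies outside the assumed bounded domain")
    sigma = gaussian_mechanism_sigma(epsilon, delta, radius)
    return x + rng.normal(0.0, sigma, size=x.shape)


rng = np.random.default_rng(0)
noisy_x = privatize([0.6, 0.0], epsilon=2.0, delta=1e-5, radius=1.0, rng=rng)
```

With $\epsilon=2$, $\delta=10^{-5}$ and a unit radius, the noise standard deviation is close to 4.8, several times larger than the data itself, which is why uncorrected training on such releases is noticeably biased.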

Figures (8)

Figure 1: Comparison of SGD convergence of the exponential loss under $(2,10^{-5})$-LDP for the 2-dimensional synthetic data and $(5,10^{-5})$-LDP for the 10-dimensional synthetic data.
Figure 2: Comparison of SGD convergence of the exponential loss under $(2,10^{-5})$-LDP on ACSPublicCoverage and ACSIncome. Left-hand plots show the averaged loss of the fitted model $\left(\theta_n\right)$, while right-hand plots show the loss of the averaged model $\left({\mathbb E}\theta_n\right)$ over random draws.
Figure 3: Approximation of the truncation error $bias_K$ for $K\in\{1,2,3\}$.
Figure 4: Comparison of accuracy convergence of the model fitted with the exponential loss under $(2,10^{-5})$-LDP on ACSPublicCoverage and ACSIncome.
Figure 5: Comparison of SGD convergence of the log loss under $(2,10^{-5})$-LDP for the 2-dimensional synthetic data and $(5,10^{-5})$-LDP for the 10-dimensional synthetic data.
...and 3 more figures

Theorems & Definitions (56)

Definition 2.1: Local Differential Privacy (LDP) kasiviswanathan2010learnprivately
Proposition 2.2: Gaussian Mechanism
Proposition 2.3: Randomized Response (RR)
Definition 3.1: Generalized Weierstrass transform
Definition 3.2: Class of Gaussian growing and slowly growing iterated Laplacians function
Theorem 3.3: Series expression of $\mathcal{W}_{\sigma^2}$
Proof: Sketch of proof.
Remark 3.4
Definition 3.5: Bernoulli transform
Theorem 4.2: Bias induced by the Gaussian and Randomized Response mechanisms in binary classification
...and 46 more
