Forward Learning with Differential Privacy

Mingqian Feng, Zeliang Zhang, Jinyang Jiang, Yijie Peng, Chenliang Xu

TL;DR

This work targets differential privacy in deep learning by exploiting forward-learning perturbations rather than backpropagation-based gradient noise. It introduces DP-ULR, a privatized forward-learning algorithm that combines a sampling-with-rejection strategy, likelihood-ratio gradient proxies, and a dynamic privacy controller to bound DP costs. The authors provide a theoretical DP analysis based on the Sampled with Rejection Gaussian Mechanism (SRGM) and $(\alpha,\gamma)$-Rényi differential privacy (RDP), showing that the privacy impact of rejection sampling is negligible and that DP-ULR can achieve DP guarantees with competitive utility relative to DP-SGD. Empirically, DP-ULR is shown to perform well on MNIST with MLPs and CIFAR-10 with CNNs, particularly at larger batch sizes, while offering advantages in parallelization and applicability to non-differentiable or black-box components.

Abstract

Differential privacy (DP) in deep learning is a critical concern as it ensures the confidentiality of training data while maintaining model utility. Existing DP training algorithms provide privacy guarantees by clipping sample gradients computed by the backpropagation algorithm and then injecting external noise into them. Unlike backpropagation, perturbation-based forward-learning algorithms inherently add noise during the forward pass and use that randomness to estimate the gradients. Although these algorithms are non-privatized, the noise introduced during the forward pass indirectly provides internal randomness protection to the model parameters and their gradients, suggesting the potential for naturally providing differential privacy. In this paper, we propose a privatized forward-learning algorithm, Differential Private Unified Likelihood Ratio (DP-ULR), and demonstrate its differential privacy guarantees. DP-ULR features a novel batch sampling operation with rejection, for which we provide theoretical analysis in conjunction with classic differential privacy mechanisms. DP-ULR is also underpinned by a theoretically guided privacy controller that dynamically adjusts noise levels to manage privacy costs in each training step. Our experiments indicate that DP-ULR achieves competitive performance compared to traditional differential privacy training algorithms based on backpropagation, maintaining nearly the same privacy loss limits.
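
To make the idea of estimating gradients from forward-pass noise concrete, the following is a minimal sketch of a likelihood-ratio (score-function) gradient estimate for a single linear layer. It is not the authors' DP-ULR: there is no clipping, no rejection sampling, and no privacy accounting. The function name `lr_gradient_estimate`, the squared-error loss, the baseline subtraction, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def lr_gradient_estimate(W, x, y, sigma=1.0, num_samples=100_000, seed=0):
    """Estimate d E[loss] / dW for v = W @ x + z, z ~ N(0, sigma^2 I),
    using only forward evaluations of the loss (no backpropagation)."""
    rng = np.random.default_rng(seed)
    mean = W @ x  # noiseless layer output
    # One independent noise vector per forward pass.
    z = rng.normal(0.0, sigma, size=(num_samples, mean.shape[0]))
    v = mean + z  # perturbed outputs, one row per sample
    loss = 0.5 * np.sum((v - y) ** 2, axis=1)
    # Subtracting a constant baseline reduces variance and adds no bias,
    # because E[z] = 0.
    baseline = 0.5 * np.sum((mean - y) ** 2)
    # Score of N(mean, sigma^2 I) with respect to the mean is z / sigma^2.
    grad_mean = ((loss - baseline)[:, None] * z / sigma**2).mean(axis=0)
    # Chain through mean = W @ x to get the gradient with respect to W.
    return np.outer(grad_mean, x)


# Hypothetical toy values for illustration only.
W = np.array([[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]])
x = np.array([1.0, 2.0, -1.0])
y = np.array([0.0, 1.0])
estimate = lr_gradient_estimate(W, x, y)
exact = np.outer(W @ x - y, x)  # closed form for this quadratic loss
```

For this quadratic loss the expected gradient has the closed form `outer(W @ x - y, x)`, so `estimate` can be compared against `exact`. The estimate is itself random, which is the property the abstract points to as a possible source of privacy protection.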

Paper Structure

This paper contains 25 sections, 7 theorems, 42 equations, 4 figures, 2 tables.

Key Result

Proposition 2.4

If $f$ is an $(\alpha, \gamma)$-RDP mechanism, it also satisfies $(\gamma+\frac{\ln 1/\delta}{\alpha-1},\delta)$-differential privacy for any $0<\delta<1$, or equivalently $(\epsilon,\exp{[(\alpha-1)(\gamma-\epsilon)]})$-differential privacy for any $\epsilon>\gamma$.
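
The two equivalent forms of Proposition 2.4 can be evaluated numerically. The sketch below is only an illustration of the stated formulas; the function names and the example values of $\alpha$, $\gamma$, and $\delta$ are assumptions, not taken from the paper.

```python
import math


def rdp_to_dp_epsilon(alpha, gamma, delta):
    """Epsilon of the (epsilon, delta)-DP guarantee implied by (alpha, gamma)-RDP."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return gamma + math.log(1.0 / delta) / (alpha - 1.0)


def rdp_to_dp_delta(alpha, gamma, epsilon):
    """Delta of the (epsilon, delta)-DP guarantee for a chosen epsilon > gamma."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if epsilon <= gamma:
        raise ValueError("epsilon must be greater than gamma")
    return math.exp((alpha - 1.0) * (gamma - epsilon))


# Hypothetical values: a (10, 0.5)-RDP mechanism evaluated at delta = 1e-5.
eps = rdp_to_dp_epsilon(alpha=10, gamma=0.5, delta=1e-5)
delta_back = rdp_to_dp_delta(alpha=10, gamma=0.5, epsilon=eps)
```

Substituting the returned `eps` into the second form recovers the original $\delta$, which is the sense in which the two statements are equivalent.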

Figures (4)

  • Figure 1: Compared to traditional training algorithms, forward learning adds noise during the forward pass and estimates naturally randomized gradients, leading to a potential free lunch of differential privacy.
  • Figure 2: Contour plots of the ratio of the first term to the second term in the DP cost equation (label eq:dp_cost).
  • Figure 3: Optimization dynamics of the MLP training with differential privacy using DP-SGD and our proposed DP-ULR and corresponding $\epsilon$ with $\delta=10^{-5}$.
  • Figure 4: Evaluation results of the CNN training with differential privacy using DP-SGD in (a) and our proposed DP-ULR in (b)--(d).

Theorems & Definitions (13)

  • Definition 2.1: $(\epsilon, \delta)$-DP [ODO]
  • Definition 2.2: Rényi divergence [RDP, mironov2019renyi]
  • Definition 2.3: Rényi differential privacy (RDP) [RDP, mironov2019renyi]
  • Proposition 2.4: From $(\alpha, \gamma)$-RDP to $(\epsilon, \delta)$-DP [RDP]
  • Proposition 3.1
  • Proposition 3.2
  • Definition 3.4: Sampled with Rejection Gaussian Mechanism (SRGM)
  • Theorem 3.5
  • Theorem 3.6
  • Theorem 1.1
  • ...and 3 more