
Training with Differential Privacy: A Gradient-Preserving Noise Reduction Approach with Provable Security

Haodi Wang, Tangyu Jiang, Yu Guo, Chengjun Cai, Cong Wang, Xiaohua Jia

TL;DR

GReDP is a gradient-preserving differentially private (DP) training method: it computes model gradients in the frequency domain via FFT, adds Gaussian noise, and recovers the gradients by applying the inverse transform and retaining only the real part. The theoretical analysis shows that the method needs half the noise scale required by DPSGD while preserving the full gradient information, which addresses a key utility gap in DP-based deep learning. Experiments on MNIST and CIFAR-10 with multiple architectures show consistent, substantial accuracy gains over DPSGD and Spectral-DP under the same privacy budgets. The result is a practical, provably secure DP training mechanism with an open-source implementation.

Abstract

Deep learning models have been extensively adopted in various domains due to their ability to represent hierarchical features, an ability that relies heavily on the training set and training procedures. Thus, protecting the training process and deep learning algorithms is paramount in privacy preservation. Although Differential Privacy (DP), as a powerful cryptographic primitive, has achieved satisfying results in deep learning training, the existing schemes still fall short in preserving model utility, i.e., they either require a high noise scale or inevitably harm the original gradients. To address these issues, we present a more robust and provably secure approach for differentially private training called GReDP. Specifically, we compute the model gradients in the frequency domain and adopt a new approach to reduce the noise level. Unlike previous work, GReDP requires only half of the noise scale of DPSGD [1] while keeping all the gradient information intact. We present a detailed analysis of our method both theoretically and empirically. The experimental results show that GReDP works consistently better than the baselines on all models and training settings.
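
The frequency-domain pipeline summarized above (transform the gradient, add Gaussian noise, invert the transform, keep the real part) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `dft`, `clip`, and `noisy_gradient` are hypothetical, a naive unitary DFT stands in for the FFT to keep the example dependency-free, and both the DPSGD-style norm clipping and the choice of noise standard deviation `sigma * max_norm` on the real and imaginary parts are assumptions made for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive unitary DFT, O(n^2); a stand-in for an FFT in this sketch."""
    n = len(x)
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(n)
    return [
        scale * sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def clip(grad, max_norm):
    """Scale grad so its L2 norm is at most max_norm (assumed DPSGD-style clipping)."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def noisy_gradient(grad, max_norm, sigma, rng):
    """Hypothetical helper: clip, transform, add noise, invert, keep the real part."""
    spectrum = dft(clip(grad, max_norm))
    std = sigma * max_norm  # assumed noise calibration, not taken from the paper
    noisy = [s + complex(rng.gauss(0, std), rng.gauss(0, std)) for s in spectrum]
    # The imaginary part of the inverse transform carries only noise, so it is dropped.
    return [z.real for z in dft(noisy, inverse=True)]


rng = random.Random(0)
grad = [0.5, -1.0, 2.0, 0.25]
print(noisy_gradient(grad, max_norm=1.0, sigma=0.0, rng=rng))  # no noise: the clipped gradient
print(noisy_gradient(grad, max_norm=1.0, sigma=0.5, rng=rng))  # noisy, real-valued gradient
```

With `sigma=0.0` the round trip returns the clipped gradient unchanged, which shows that the transform itself discards no gradient information in this sketch; only the added noise perturbs the result.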

Paper Structure

This paper contains 37 sections, 6 theorems, 29 equations, 5 figures, 8 tables, 2 algorithms.

Key Result

Proposition 1

If $f$ is an $(\alpha, \epsilon)$-RDP mechanism, then it also satisfies $\left(\epsilon + \frac{\log(1/\delta)}{\alpha-1}, \delta\right)$-DP for any $\delta \in (0, 1)$.
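
The conversion in Proposition 1 is a one-line formula, and a small sketch makes the trade-off between the order $\alpha$ and the resulting $\epsilon$ concrete. The function name `rdp_to_dp` and the numeric values below are hypothetical and not taken from the paper; the example assumes the listed pairs are RDP guarantees that the same mechanism satisfies at different orders, in which case the smallest converted value is a valid DP bound.

```python
import math


def rdp_to_dp(alpha, eps_rdp, delta):
    """Convert an (alpha, eps_rdp)-RDP guarantee to (eps, delta)-DP via Proposition 1."""
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    return eps_rdp + math.log(1 / delta) / (alpha - 1)


# Illustrative (alpha, eps_rdp) pairs for one mechanism; the values are made up.
candidates = [(2, 0.05), (8, 0.2), (32, 0.8)]
delta = 1e-5

# Each pair yields a valid (eps, delta)-DP bound, so report the tightest one.
best_eps = min(rdp_to_dp(alpha, eps, delta) for alpha, eps in candidates)
print(best_eps)
```

A small $\alpha$ inflates the $\frac{\log(1/\delta)}{\alpha-1}$ term, while a large $\alpha$ usually comes with a larger RDP $\epsilon$, which is why implementations typically evaluate several orders and keep the minimum.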

Figures (5)

  • Figure 1: A typical deep learning procedure and SGD algorithm.
  • Figure 2: The overview of $\mathsf{GReDP}$.
  • Figure 3: Relationship between the testing accuracy and $\epsilon$ (left), and training dynamics when $\epsilon=2$ (middle and right).
  • Figure 4: Relationship between the accuracy and batch size.
  • Figure 5: Relationship between the accuracy and learning rate.

Theorems & Definitions (14)

  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 1
  • Theorem 1
  • Proof
  • Theorem 2
  • Theorem 3
  • Corollary 1
  • Corollary 2
  • ...and 4 more