Differential Privacy Mechanisms in Neural Tangent Kernel Regression

Jiuxiang Gu, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song

TL;DR

This work establishes provable guarantees for both differential privacy and test accuracy in the NTK regression setting. It is the first work to provide a DP guarantee for NTK regression, and it aims to explain how privacy mechanisms work in AI applications.

Abstract

Training data privacy is a fundamental problem in modern Artificial Intelligence (AI) applications such as face recognition, recommendation systems, and language generation, because training data may contain sensitive user information related to legal issues. To understand how privacy mechanisms work in AI applications at a fundamental level, we study differential privacy (DP) in the Neural Tangent Kernel (NTK) regression setting. DP is one of the most powerful tools for measuring privacy in statistical learning, and NTK is one of the most popular frameworks for analyzing the learning mechanisms of deep neural networks. We show provable guarantees for both the differential privacy and the test accuracy of our NTK regression. Furthermore, we conduct experiments on the basic image classification dataset CIFAR10 to demonstrate that NTK regression can preserve good accuracy under a modest privacy budget, supporting the validity of our analysis. To our knowledge, this is the first work to provide a DP guarantee for NTK regression.

Paper Structure (43 sections, 41 theorems, 165 equations, 1 figure, 2 algorithms)

Key Result

Theorem 1.1

Under proper conditions, for any test data $x$, NTK regression is $(\epsilon,\delta)$-DP and has good utility with high probability.
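
For orientation, the $(\epsilon,\delta)$-DP property named in the theorem is commonly stated as below. This is the textbook form, not a quotation from the paper: the symbols $\mathcal{A}$ (randomized algorithm), $D, D'$ (neighboring datasets), and $S$ (set of outputs) are notation assumed here, and the paper's own Definition 3.1, together with its notion of $\beta$-close neighboring datasets, should be consulted for the exact statement it uses.

$$
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\epsilon} \cdot \Pr[\mathcal{A}(D') \in S] + \delta
\quad \text{for all neighboring } D, D' \text{ and all measurable } S .
$$

A smaller $\epsilon$ and $\delta$ mean the output distribution changes less when one dataset is swapped for its neighbor, which is why a larger privacy budget $\epsilon$ typically permits higher accuracy.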

Figures (1)

  • Figure 1: The trade-off between the accuracy parameter and the privacy parameter. We run experiments with different privacy budgets $\epsilon_{\mathrm{dp}}$, fixing $\delta = 2 \times 10^{-3}$ and assuming $\beta = 10^{-6}$. The x-axis shows $\log (\epsilon_{\mathrm{dp}})$, where $\log$ denotes $\log_{10}$. The y-axis shows the binary classification accuracy. As the privacy budget $\epsilon_{\mathrm{dp}}$ increases, the private train and test accuracy approach the non-private train and test accuracy, respectively.

Theorems & Definitions (74)

  • Theorem 1.1: Main result, informal version of Theorem \ref{thm:main}
  • Definition 3.1: Differential Privacy [dr14]
  • Lemma 3.2: Truncated Laplace Mechanism [dr14, gdgk20, aimn23]
  • Lemma 3.3: Post-Processing Lemma for DP [dr14]
  • Lemma 3.4: Composition Lemma for DP [dr14]
  • Definition 3.5: Discrete Quadratic NTK Kernel
  • Definition 3.6: Continuous Quadratic NTK Kernel
  • Definition 3.7: Classical Kernel Ridge Regression [lss+20]
  • Definition 3.8: NTK Regression [lss+20]
  • Definition 4.1: $\beta$-close neighbor dataset [gsy23]
  • ...and 64 more
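
The truncated Laplace mechanism listed above (Lemma 3.2) can be illustrated with a small scalar sketch. This is not the paper's algorithm: it only shows the general idea of adding bounded Laplace noise to a value. The parameterization (scale $\Delta/\epsilon$ and truncation bound $(\Delta/\epsilon)\log(1 + (e^{\epsilon}-1)/(2\delta))$) is the form commonly given in the literature and is an assumption here; it should be checked against the paper's Lemma 3.2. All function names are hypothetical, and $\epsilon = 1$ in the usage lines is an arbitrary illustrative value, while $\delta = 2 \times 10^{-3}$ matches the Figure 1 caption.

```python
import math
import random


def truncated_laplace_params(epsilon, delta, sensitivity):
    """Return (scale, bound) for truncated Laplace noise.

    Assumed parameterization (verify against the paper's Lemma 3.2):
      scale = sensitivity / epsilon
      bound = scale * log(1 + (exp(epsilon) - 1) / (2 * delta))
    """
    scale = sensitivity / epsilon
    bound = scale * math.log(1.0 + (math.exp(epsilon) - 1.0) / (2.0 * delta))
    return scale, bound


def sample_truncated_laplace(scale, bound, rng):
    """Draw Laplace(0, scale) noise conditioned on |noise| <= bound.

    Uses rejection sampling. A Laplace sample is generated as the
    difference of two independent exponentials with rate 1 / scale.
    """
    rate = 1.0 / scale
    while True:
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        if abs(noise) <= bound:
            return noise


def privatize(value, epsilon, delta, sensitivity, rng=None):
    """Add truncated Laplace noise to a scalar value (illustrative only)."""
    rng = rng or random.Random()
    scale, bound = truncated_laplace_params(epsilon, delta, sensitivity)
    return value + sample_truncated_laplace(scale, bound, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    scale, bound = truncated_laplace_params(epsilon=1.0, delta=2e-3, sensitivity=1.0)
    print("scale:", scale, "bound:", bound)
    print("noisy value:", privatize(0.5, 1.0, 2e-3, 1.0, rng))
```

Unlike plain Laplace noise, the truncated variant never perturbs a value by more than the bound, which makes worst-case utility easier to reason about. The price is a nonzero $\delta$ in the privacy guarantee.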