Differentially Private Synthetic Data with Private Density Estimation

Nikolija Bojkovic, Po-Ling Loh

TL;DR

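This paper builds upon the private synthetic data algorithm of Boedihardjo et al., replacing its uniform sampling step with a private distribution estimator.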

Abstract

The need to analyze sensitive data, such as medical records or financial data, has created a critical research challenge in recent years. In this paper, we adopt the framework of differential privacy and explore mechanisms for generating an entire dataset that accurately captures characteristics of the original data. We build upon the work of Boedihardjo et al., which laid the foundations for a new optimization-based algorithm for generating private synthetic data. Importantly, we adapt their algorithm by replacing a uniform sampling step with a private distribution estimator; this allows us to obtain better computational guarantees for discrete distributions and to develop a novel algorithm suitable for continuous distributions. We also explore applications of our work to several statistical tasks.
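
To make the phrase "private distribution estimator" concrete, the following is a minimal sketch of one standard estimator for discrete data, the Laplace-noised histogram. It is not necessarily the estimator used in the paper. The function names, the toy data, and the choice of the replace-one-record neighboring relation (which gives L1 sensitivity 2) are all assumptions made for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(data, domain, epsilon, rng):
    """Illustrative epsilon-DP estimate of a distribution over `domain`."""
    counts = Counter(data)
    # Assumption: neighboring datasets differ by replacing one record,
    # which changes two counts by 1 each, so the L1 sensitivity is 2.
    scale = 2.0 / epsilon
    # Clipping at zero and normalizing are post-processing steps,
    # so they do not affect the privacy guarantee.
    noisy = [max(counts[x] + laplace_noise(scale, rng), 0.0) for x in domain]
    total = sum(noisy)
    if total == 0.0:
        # Fall back to the uniform distribution if every count was clipped.
        return {x: 1.0 / len(domain) for x in domain}
    return {x: c / total for x, c in zip(domain, noisy)}


def sample_synthetic(dist, m, rng):
    """Draw m synthetic records from the estimated distribution."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=m)


rng = random.Random(0)
data = [0] * 50 + [1] * 30 + [2] * 20          # toy dataset, hypothetical
domain = [0, 1, 2, 3]
estimate = private_histogram(data, domain, epsilon=1.0, rng=rng)
synthetic = sample_synthetic(estimate, m=10, rng=rng)
```

Sampling from `estimate` instead of from the uniform distribution over `domain` mirrors, at a very coarse level, the substitution described in the abstract; the optimization step of the actual algorithm is not shown here.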

Paper Structure

This paper contains 30 sections, 13 theorems, 63 equations, 2 figures, 1 table, and 3 algorithms.

Key Result

Theorem 2.1

Let $\delta, \gamma > 0$ and set $\sigma = \frac{\delta}{\log(|\mathcal{F}|/\gamma)}$.
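
As a small worked example of the parameter choice above, the sketch below evaluates $\sigma$ for given $\delta$, $\gamma$, and $|\mathcal{F}|$. It assumes that $\log$ is the natural logarithm and that $\mathcal{F}$ is a finite family; the function name and the numeric values are hypothetical. The remainder of the theorem statement is not shown in this excerpt, so the sketch only covers the arithmetic.

```python
import math


def sigma_from_parameters(delta, gamma, num_functions):
    """Evaluate sigma = delta / log(|F| / gamma).

    Assumptions (not confirmed by the excerpt): natural logarithm,
    and num_functions = |F| is a finite positive integer.
    """
    if delta <= 0 or gamma <= 0:
        raise ValueError("delta and gamma must be positive")
    ratio = num_functions / gamma
    if ratio <= 1:
        # Guard added for this sketch: the logarithm must be positive
        # for sigma to be a positive, finite number.
        raise ValueError("|F| / gamma must exceed 1")
    return delta / math.log(ratio)


# Hypothetical values: delta = 0.1, gamma = 0.05, |F| = 1000.
sigma = sigma_from_parameters(delta=0.1, gamma=0.05, num_functions=1000)
```

With these values, $|\mathcal{F}|/\gamma = 20000$ and $\sigma \approx 0.0101$; because $\sigma$ depends on $|\mathcal{F}|$ only through the logarithm, it decreases slowly as the family grows.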

Figures (2)

  • Figure 1: Comparison of Execution Time
  • Figure 2: Comparison of Error

Theorems & Definitions (26)

  • Definition 2.1: Privacy
  • Definition 2.2: Accuracy
  • Theorem 2.1: Theorems 2.2 & 2.3 of Boedihardjo et al. [BoeEtal22]
  • Remark 1
  • Theorem 3.1
  • Remark 2
  • Corollary 3.1.1
  • Remark 3
  • Theorem 4.1
  • Corollary 4.1.1
  • ...and 16 more