PrAda-GAN: A Private Adaptive Generative Adversarial Network with Bayes Network Structure

Ke Jia, Yuheng Ma, Yang Li, Feifei Wang

TL;DR

PrAda-GAN addresses differentially private (DP) tabular data synthesis by combining GAN training with an adaptive Bayes-network structure, using a sequential autoregressive generator to capture dependencies among variables. A group-lasso penalty recovers a sparse conditional dependency graph, which improves convergence and generalization under privacy constraints. The theoretical results bound the parameter estimation error and the Wasserstein-distance generalization error, with sharper rates when the Bayes network is sparse. Empirically, PrAda-GAN outperforms baselines in distributional similarity and downstream ML tasks on synthetic and real datasets, particularly under low privacy budgets. The approach offers a scalable DP solution for continuous domains that adapts automatically to underlying low-dimensional structure while preserving utility.

Abstract

We revisit the problem of generating synthetic data under differential privacy. To address the core limitations of marginal-based methods, we propose the Private Adaptive Generative Adversarial Network with Bayes Network Structure (PrAda-GAN), which integrates the strengths of both GAN-based and marginal-based approaches. Our method adopts a sequential generator architecture to capture complex dependencies among variables, while adaptively regularizing the learned structure to promote sparsity in the underlying Bayes network. Theoretically, we establish diminishing bounds on the parameter distance, variable selection error, and Wasserstein distance. Our analysis shows that leveraging dependency sparsity leads to significant improvements in convergence rates. Empirically, experiments on both synthetic and real-world datasets demonstrate that PrAda-GAN outperforms existing tabular data synthesis methods in terms of the privacy-utility trade-off.
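
The sparsity-promoting regularization mentioned above can be illustrated with a small sketch. The exact penalty used by PrAda-GAN is not given in this summary, so everything below is an assumption made for illustration: each variable gets a hypothetical first-layer weight matrix whose columns correspond to candidate parent variables, each column is treated as one group, and a parent counts as selected when its column norm is nonzero. The names `column_norms`, `group_lasso_penalty`, and `selected_parents` are invented for this example.

```python
import math


def column_norms(weights):
    """Euclidean norm of each column of a row-major matrix (list of rows)."""
    if not weights:
        return []
    n_cols = len(weights[0])
    return [math.sqrt(sum(row[k] ** 2 for row in weights)) for k in range(n_cols)]


def group_lasso_penalty(first_layer_weights, lambdas):
    """Sum over variables j of lambda_j times the sum of column norms of W_j.

    Assumption: column k of W_j collects every weight that connects
    candidate parent k to the sub-generator of variable j, so shrinking a
    whole column to zero removes that edge from the dependency graph.
    """
    return sum(
        lam * sum(column_norms(w_j))
        for w_j, lam in zip(first_layer_weights, lambdas)
    )


def selected_parents(weights, tol=1e-8):
    """Indices of candidate parents whose weight column is not (numerically) zero."""
    return [k for k, norm in enumerate(column_norms(weights)) if norm > tol]


# Hypothetical weights for one variable with two candidate parents:
# the second column is all zeros, so only parent 0 is kept.
w_example = [[3.0, 0.0],
             [4.0, 0.0]]
print(group_lasso_penalty([w_example], [0.1]))
print(selected_parents(w_example))
```

Penalizing whole columns rather than individual weights is what makes the zero pattern readable as a graph: a parent is either fully present or fully absent for a given variable.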

Paper Structure

This paper contains 41 sections, 4 theorems, 68 equations, 10 figures, 8 tables, 2 algorithms.

Key Result

Theorem 1

Suppose that Assumptions asp:bayesnetwork, asp:private-optimization, asp:finite-moment, and asp:analytic hold. Suppose that $\lambda_j=d^{-1}\psi^{1/2}_{n,d} , j = 1,\ldots,d$, where we define Then, the fitted parameters ${\bm \theta}^{T+1}$ and ${\bm W }^{T} \in {\bm \theta}^{T+1}$ from Algorithm alg:privdag-gan (without comment) satisfy as well as Here, $a>2$ is a positive constant. The expe

Figures (10)

  • Figure 1: Drawback illustration of marginal-based methods.
  • Figure 2: Average WD and TVD. Top: varying discriminator learning rates $d_{\text{lr}}$ across $10^{h}$ with $g_{\text{lr}} = 10^{-2}$. Bottom: varying generator learning rates $g_{\text{lr}}$ across $10^{h}$ with $d_{\text{lr}} = 10^{-1}$.
  • Figure 3: Average WD and TVD under varying $\lambda$ and $\gamma$.
  • Figure 4: An example Bayes network over continuous variables.
  • Figure 5: Average MMD, JS, and TVD (1-way) under nonlinear function $f_j(\Pi_j)$. Top: varying discriminator learning rates $d_{\text{lr}}$ across $10^{h}$ for $h \in \{-3.5, \dots, 1\}$ with $g_{\text{lr}} = 10^{-3}$. Bottom: varying generator learning rates $g_{\text{lr}}$ across $10^{h}$ for $h \in \{-4, -3.5, \dots, -1.5\}$ with $d_{\text{lr}} = 10^{-1}$.
  • ...and 5 more figures
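
The figure captions report WD (Wasserstein distance) and TVD (total variation distance) between real and synthetic data. How the paper computes these is not stated in this summary; the following is only a minimal sketch of the standard one-dimensional empirical versions, assuming equal-size samples for WD and categorical (or pre-binned) values for TVD. The function names are illustrative.

```python
from collections import Counter


def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    For equal sample sizes, the optimal coupling matches order statistics,
    so the distance is the mean absolute gap between the sorted samples.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def total_variation(xs, ys):
    """Total variation distance between the empirical distributions of two
    samples of categorical (or already binned) values."""
    if not xs or not ys:
        raise ValueError("samples must be non-empty")
    p, q = Counter(xs), Counter(ys)
    support = set(p) | set(q)
    return 0.5 * sum(abs(p[v] / len(xs) - q[v] / len(ys)) for v in support)


# Toy samples standing in for one real and one synthetic column.
real = [0.0, 1.0, 2.0]
synthetic = [1.0, 2.0, 3.0]
print(wasserstein_1d(real, synthetic))
print(total_variation(["a", "a", "b", "b"], ["a", "b", "b", "b"]))
```

Both quantities are zero when the two samples have identical empirical distributions, and lower values indicate closer agreement between real and synthetic data.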

Theorems & Definitions (10)

  • Definition 1: Differential privacy \cite{dwork2006calibrating}
  • Theorem 1
  • Theorem 2
  • Example 1
  • Proof of Theorem \ref{thm:feature-selection}
  • Lemma 1: Exact Selection
  • Proof of Lemma \ref{lem:exact-selection}
  • Lemma 2: Reduced Discriminator Complexity
  • Proof of Lemma \ref{lem:reduced-disco-complexity}
  • Proof of Theorem \ref{thm:estimation}