The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging

Georgi Ganev; Meenatchi Sundaram Muthu Selva Annamalai; Emiliano De Cristofaro

TL;DR

The paper assesses the reproducibility and privacy of six public PATE-GAN implementations by reproducing the original utility benchmarks and applying DP auditing via membership inference attacks. It finds that none replicate the reported utility, while the DP auditing reveals substantial privacy leaks and numerous bugs across implementations. The study documents 19 privacy violations and 5 other bugs, analyzes root causes in data partitioning and accounting, and releases an open-source auditing toolkit to promote robust, auditable DP synthetic-data systems. The work highlights the critical need for thorough reproducibility and vigilant privacy auditing as DP methods move from theory to practice, especially in high-stakes domains.

Abstract

Synthetic data created by differentially private (DP) generative models is increasingly used in real-world settings. In this context, PATE-GAN has emerged as one of the most popular algorithms, combining Generative Adversarial Networks (GANs) with the private training approach of PATE (Private Aggregation of Teacher Ensembles). In this paper, we set out to reproduce the utility evaluation from the original PATE-GAN paper, compare available implementations, and conduct a privacy audit. More precisely, we analyze and benchmark six open-source PATE-GAN implementations, including three by (a subset of) the original authors. First, we shed light on architecture deviations and empirically demonstrate that none reproduce the utility performance reported in the original paper. We then present an in-depth privacy evaluation, which includes DP auditing, and show that all implementations leak more privacy than intended. Furthermore, we uncover 19 privacy violations and 5 other bugs in these six open-source implementations. Lastly, our codebase is available from: https://github.com/spalabucr/pategan-audit.
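To make the DP auditing step concrete, the sketch below shows the standard hypothesis-testing conversion from a membership inference attack's error rates to an empirical lower bound on $\epsilon$. It is a minimal illustration, not the authors' auditing code: the function name `empirical_epsilon` and its parameters are hypothetical, and the default $\delta=10^{-5}$ is taken from the Figure 1 caption. It relies on the fact that any $(\epsilon,\delta)$-DP mechanism forces every attack to satisfy $\mathrm{FPR} + e^{\epsilon}\,\mathrm{FNR} \ge 1-\delta$ and $\mathrm{FNR} + e^{\epsilon}\,\mathrm{FPR} \ge 1-\delta$.

```python
import math


def empirical_epsilon(fpr: float, fnr: float, delta: float = 1e-5) -> float:
    """Lower bound on epsilon implied by an attack's error rates.

    Hypothetical helper (not from the paper's codebase). Rearranging
    FPR + e^eps * FNR >= 1 - delta gives e^eps >= (1 - delta - FPR) / FNR,
    and symmetrically with FPR and FNR swapped.
    """
    candidates = []
    for a, b in ((fpr, fnr), (fnr, fpr)):
        numerator = 1.0 - delta - a
        if numerator <= 0.0:
            continue  # this inequality gives no information about epsilon
        if b == 0.0:
            return math.inf  # a zero error rate is impossible under finite epsilon
        candidates.append(math.log(numerator / b))
    # Epsilon is non-negative, so clip uninformative (negative) bounds to 0.
    return max(0.0, max(candidates, default=0.0))
```

If the value returned for an attack exceeds the $\epsilon$ the implementation claims, the implementation leaks more than its stated guarantee. In practice the observed FPR and FNR come from a finite number of attack trials, so an audit would usually substitute statistical upper confidence bounds (for example Clopper-Pearson) for the raw rates before applying this conversion.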

Paper Structure

This paper contains 19 sections, 2 equations, 12 figures, 13 tables, and 7 algorithms.

Introduction
Preliminaries
Related Work
PATE-GAN Implementations
Utility Benchmark
Privacy Evaluation
PATE-GAN Training
DP Auditing of PATE-GAN
Summary of Privacy Violations
Conclusion
PATE-GAN Algorithm and Differences with PATE
PATE-GAN Algorithm
Differences between PATE-GAN and PATE
Additional Preliminaries
Datasets
...and 4 more sections

Figures (12)

Figure 1: Performance comparison of 12 classifiers (averaged) in Setting B (train on synthetic, test on real) in terms of AUROC with various $\epsilon$ (with $\delta=10^{-5}$) on UCI Epileptic Seizure.
Figure 2: Data records seen by the five teachers-discriminators ($\epsilon=1$) on Kaggle Cervical Cancer.
Figure 3: Cross entropy of the five teachers-discriminators on a fixed subset of data ($\epsilon=1$).
Figure 4: Moments accountant values for 1,000 training iterations.
Figure 5: DP auditing with different black-box MIAs ($\epsilon=1$, as per the dashed red lines).
...and 7 more figures
