The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging
Georgi Ganev, Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro
TL;DR
The paper assesses the reproducibility and privacy of six public PATE-GAN implementations by rerunning the original utility benchmarks and applying DP auditing via membership inference attacks. None of the implementations reproduces the reported utility, and the audit reveals substantial privacy leakage and numerous bugs across them. The study documents 19 privacy violations and 5 other bugs, analyzes root causes in data partitioning and privacy accounting, and releases an open-source auditing toolkit to promote robust, auditable DP synthetic-data systems. The work highlights the need for thorough reproducibility checks and privacy auditing as DP methods move from theory to practice, especially in high-stakes domains.
Abstract
Synthetic data created by differentially private (DP) generative models is increasingly used in real-world settings. In this context, PATE-GAN has emerged as one of the most popular algorithms, combining Generative Adversarial Networks (GANs) with the private training approach of PATE (Private Aggregation of Teacher Ensembles). In this paper, we set out to reproduce the utility evaluation from the original PATE-GAN paper, compare available implementations, and conduct a privacy audit. More precisely, we analyze and benchmark six open-source PATE-GAN implementations, including three by (a subset of) the original authors. First, we shed light on architecture deviations and empirically demonstrate that none of the implementations reproduces the utility performance reported in the original paper. We then present an in-depth privacy evaluation, which includes DP auditing, and show that all implementations leak more privacy than intended. Furthermore, we uncover 19 privacy violations and 5 other bugs in these six open-source implementations. Our codebase is available at https://github.com/spalabucr/pategan-audit.
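To illustrate what "DP auditing" means operationally, the following minimal sketch converts the error rates of a membership inference attack into an empirical lower bound on epsilon. It relies on the standard hypothesis-testing view of (epsilon, delta)-DP, under which any attack must satisfy FPR + e^epsilon * FNR >= 1 - delta (and the same with FPR and FNR swapped). The function name `empirical_epsilon` and its parameters are hypothetical and not taken from the paper's codebase. The sketch also returns only a point estimate; a rigorous audit would typically put confidence intervals on the error rates measured over many trials.

```python
import math


def empirical_epsilon(fpr: float, fnr: float, delta: float = 0.0) -> float:
    """Point estimate of the epsilon lower bound implied by an attack.

    fpr: false positive rate of the membership inference attack
    fnr: false negative rate of the membership inference attack
    delta: the delta of the (epsilon, delta)-DP guarantee being audited
    """
    candidates = [0.0]  # epsilon is never negative
    # (epsilon, delta)-DP implies a + exp(epsilon) * b >= 1 - delta
    # for (a, b) = (fpr, fnr) and for (a, b) = (fnr, fpr).
    for a, b in ((fpr, fnr), (fnr, fpr)):
        numerator = 1.0 - delta - a
        if numerator <= 0:
            continue  # this direction gives no information
        if b == 0:
            candidates.append(math.inf)  # perfect attack in this direction
        else:
            candidates.append(math.log(numerator / b))
    return max(candidates)
```

For example, if an implementation claims epsilon = 1 but an attack achieves FPR = FNR = 0.05 with delta = 0, the estimate is ln(19), about 2.94, which exceeds the claimed budget and would point to a possible privacy violation.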
