PATE-TripleGAN: Privacy-Preserving Image Synthesis with Gaussian Differential Privacy
Zepeng Jiang, Weiwei Ni, Yifan Zhang
TL;DR
Privacy leakage in CGANs motivates privacy-preserving data synthesis. PATE-TripleGAN introduces a three-party min-max framework that adds a classifier and combines PATE-based gradient desensitization with DPSGD under Gaussian-DP, enabling semi-supervised training on unlabeled data. A hybrid gradient desensitization strategy treats generator and classifier gradients differently to preserve useful information and improve convergence. Experiments on MNIST and Fashion-MNIST show that, under the same privacy budget, PATE-TripleGAN yields higher-quality labeled data and stronger downstream classifier performance than DPCGAN, especially in low-data or tight-privacy regimes.
Abstract
Conditional Generative Adversarial Networks (CGANs) show significant potential for training supervised learning models because they can generate realistic labeled images. However, numerous studies have indicated that CGAN models carry a risk of privacy leakage. DPCGAN, an existing solution built on the differential privacy framework, relies heavily on labeled data for training and can distort the original gradient information through excessive gradient clipping, which makes it difficult to maintain model accuracy. To address these challenges, we present a privacy-preserving training framework called PATE-TripleGAN. The framework incorporates a classifier that pre-classifies unlabeled data, establishing a three-party min-max game that reduces dependence on labeled data. Furthermore, we present a hybrid gradient desensitization algorithm based on the Private Aggregation of Teacher Ensembles (PATE) framework and the Differentially Private Stochastic Gradient Descent (DPSGD) method. This algorithm lets the model retain gradient information more effectively while ensuring privacy protection, thereby improving the model's utility. Privacy analysis and extensive experiments confirm that PATE-TripleGAN can generate a higher-quality labeled image dataset while preserving the privacy of the training data.
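As a rough illustration of the two building blocks named above, the following minimal sketch shows the textbook form of a DPSGD aggregation step (per-sample clipping plus Gaussian noise) and of PATE-style noisy vote aggregation. It is not the paper's hybrid algorithm: the function names `clip_and_noise` and `pate_noisy_label`, their parameters, and the toy inputs are all hypothetical, and calibrating the noise scale to a privacy budget is deliberately left out.

```python
import math
import random


def clip_and_noise(per_sample_grads, clip_norm, noise_multiplier, rng):
    """DPSGD-style aggregation (illustrative): clip each per-sample gradient
    to L2 norm `clip_norm`, sum, add Gaussian noise, then average."""
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for g in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, x in enumerate(g):
            total[i] += x * scale
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    n = len(per_sample_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


def pate_noisy_label(teacher_votes, num_classes, sigma, rng):
    """PATE-style aggregation (illustrative): count teacher votes per class,
    add Gaussian noise to each count, and return the noisy argmax."""
    counts = [0.0] * num_classes
    for v in teacher_votes:
        counts[v] += 1.0
    noisy = [c + rng.gauss(0.0, sigma) for c in counts]
    return max(range(num_classes), key=lambda k: noisy[k])


# Toy usage with noise switched off so the effect of clipping is visible.
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms of about 5.0 and 0.5
clipped_mean = clip_and_noise(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
label = pate_noisy_label([1, 1, 1, 0, 2], num_classes=3, sigma=0.0, rng=rng)
print(clipped_mean, label)
```

In this toy run the first gradient is scaled down to unit norm while the second passes through unchanged, which illustrates how aggressive clipping can distort gradient information. With nonzero `noise_multiplier` or `sigma`, both outputs become randomized.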
