Deep Learning with Differential Privacy
Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang
TL;DR
This work shows that deep neural networks with non-convex objectives can be trained under differential privacy at a modest privacy budget. It introduces differentially private stochastic gradient descent (DP-SGD) together with a moments accountant that bounds the cumulative privacy loss over many training steps more tightly than the strong composition theorem, and it validates the approach on MNIST and CIFAR-10. Key contributions are a practical DP training pipeline, the tighter accounting method, and utility-preserving techniques such as differentially private PCA and the selective use of pre-trained convolutional layers. The results establish a concrete privacy-utility trade-off for deep learning on these benchmarks and suggest that private learning, including in on-device settings, is practical with manageable computational overhead.
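As a rough illustration of the DP-SGD idea summarized above, the following minimal NumPy sketch performs one update: each per-example gradient is clipped to a fixed L2 norm, Gaussian noise scaled to that norm is added to the sum, and the noisy average drives an ordinary gradient step. The function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for this example, not identifiers from the paper's implementation. The sketch assumes a flat parameter vector and leaves out privacy accounting entirely.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD update on a flat parameter vector.

    per_example_grads has shape (batch_size, num_params): one gradient per example.
    """
    # Clip each example's gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)

    # Add Gaussian noise with standard deviation noise_multiplier * clip_norm
    # to the summed gradient, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]

    # Ordinary gradient descent step using the privatized gradient.
    return params - lr * noisy_mean


# Usage with made-up gradients: the first has norm 5 and is clipped to norm 1.
rng = np.random.default_rng(0)
grads = np.array([[3.0, 4.0, 0.0], [0.3, 0.4, 0.0]])
new_params = dp_sgd_step(np.zeros(3), grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single example can change the summed gradient, which is what lets the Gaussian noise scale be calibrated to `clip_norm`. Turning a sequence of such steps into an overall privacy guarantee requires an accounting method such as the moments accountant, which this sketch does not implement.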
Abstract
Machine learning techniques based on neural networks are achieving remarkable results in a wide variety of domains. Often, the training of models requires large, representative datasets, which may be crowdsourced and contain sensitive information. The models should not expose private information in these datasets. Addressing this goal, we develop new algorithmic techniques for learning and a refined analysis of privacy costs within the framework of differential privacy. Our implementation and experiments demonstrate that we can train deep neural networks with non-convex objectives, under a modest privacy budget, and at a manageable cost in software complexity, training efficiency, and model quality.
