Differentially Private Learning Needs Better Features (or Much More Data)
Florian Tramèr, Dan Boneh
TL;DR
This paper studies differentially private learning on vision tasks and finds that handcrafted priors, in the form of ScatterNet features, give a substantially better privacy-utility tradeoff than end-to-end private CNNs at moderate privacy budgets. Linear models trained on ScatterNet features often outperform private deep models, and deeper private models help but remain limited without stronger priors. The authors demonstrate two practical routes to narrowing the utility gap caused by differential privacy: collecting more private data, or using transfer learning from public data to obtain better features. They provide strong baselines, analyze convergence dynamics, and outline open problems, including faster-converging training methods and federated learning with differential privacy, to guide future work on private deep learning.
Abstract
We demonstrate that differentially private machine learning has not yet reached its "AlexNet moment" on many canonical vision tasks: linear models trained on handcrafted features significantly outperform end-to-end deep neural networks for moderate privacy budgets. To exceed the performance of handcrafted features, we show that private learning requires either much more private data, or access to features learned on public data from a similar domain. Our work introduces simple yet strong baselines for differentially private learning that can inform the evaluation of future progress in this area.
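To make the "linear model on fixed features" baseline concrete, the following is a minimal sketch of one noisy gradient step for a binary logistic regression classifier on precomputed feature vectors. It assumes DP-SGD-style training (per-example gradient clipping followed by Gaussian noise), which the text above does not spell out. The names `clip`, `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are illustrative and are not taken from the paper. The sketch computes no privacy budget: turning the noise level and number of steps into an (ε, δ) guarantee requires a separate privacy accountant.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def clip(vec, max_norm):
    """Rescale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def dp_sgd_step(weights, features, labels, lr, clip_norm, noise_multiplier, rng):
    """One noisy gradient step for logistic regression on fixed features.

    features: list of feature vectors (e.g. precomputed handcrafted features)
    labels:   list of 0/1 labels, one per feature vector
    """
    grad_sum = [0.0] * len(weights)
    for x, y in zip(features, labels):
        logit = sum(w * xi for w, xi in zip(weights, x))
        # Gradient of the logistic loss for a single example.
        per_example_grad = [(sigmoid(logit) - y) * xi for xi in x]
        # Clipping bounds each example's influence on the update.
        clipped = clip(per_example_grad, clip_norm)
        grad_sum = [g + c for g, c in zip(grad_sum, clipped)]

    # Gaussian noise scaled to the clipping norm is added to the summed gradient.
    sigma = noise_multiplier * clip_norm
    noisy = [g + rng.gauss(0.0, sigma) for g in grad_sum]

    batch_size = len(features)
    return [w - lr * g / batch_size for w, g in zip(weights, noisy)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for precomputed features; real inputs would be much larger.
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -1.0]]
    labs = [1, 0, 1, 0]
    w = [0.0, 0.0]
    for _ in range(20):
        w = dp_sgd_step(w, feats, labs, lr=0.5, clip_norm=1.0,
                        noise_multiplier=1.0, rng=rng)
    print(w)
```

Because the features are fixed, only the linear weights receive clipped, noised updates, so the model being trained privately stays small.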
