A General Approach to Adding Differential Privacy to Iterative Training Procedures
H. Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz
TL;DR
This work addresses the challenge of adding differential privacy (DP) to iterative training on privacy-sensitive data. It introduces a modular framework that decouples three concerns: the training algorithm, the configuration of the privacy mechanism, and the privacy accounting. The framework generalizes the Gaussian mechanism and the Moments Accountant to multiple heterogeneous vector groups, clipped either separately or jointly, and introduces a privacy ledger that enables robust, post hoc DP accounting. Key contributions include two DP mechanisms for vector groups (separate clipping and joint clipping), a composition method that transforms queries over multiple groups into a single Gaussian sum query, strategies for choosing hyperparameters and sampling policies, and an integration path via TensorFlow Privacy. Because the components stay modular, the approach gives provable privacy guarantees for complex training procedures, including federated settings, and allows a training run to be re-analyzed as tighter DP bounds are discovered.
Abstract
In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training algorithms often require estimating many different quantities (vectors) from the same set of examples: for example, gradients of different layers in a deep learning architecture, as well as metrics and batch normalization parameters. Each of these may differ in properties such as dimensionality, magnitude, and tolerance to noise. By extending previous work on the Moments Accountant for the subsampled Gaussian mechanism, we can provide privacy for such heterogeneous sets of vectors, while also structuring the approach to minimize software engineering challenges.
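To make the idea of heterogeneous vector groups concrete, the following is a minimal NumPy sketch of the two clipping strategies and of how several groups can be viewed as a single Gaussian sum query. It is an illustration under stated assumptions, not the paper's or TensorFlow Privacy's implementation: the function names (`clip_separately`, `clip_jointly`, `effective_noise_multiplier`, `noisy_sum`) are hypothetical, and the effective noise multiplier is derived from the standard rescaling argument (divide group i by its noise standard deviation sigma_i so that all noise has unit variance, then take the L2 sensitivity of the concatenated vector).

```python
import numpy as np


def clip_to_norm(v, clip):
    """Scale v down so that its L2 norm is at most `clip`."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    return v * min(1.0, clip / norm)


def clip_separately(groups, clips):
    """Separate clipping: group i gets its own L2 bound clips[i]."""
    return [clip_to_norm(g, c) for g, c in zip(groups, clips)]


def clip_jointly(groups, total_clip):
    """Joint clipping: bound the L2 norm of the concatenation of all groups."""
    joint_norm = np.sqrt(sum(np.sum(g ** 2) for g in groups))
    scale = 1.0 if joint_norm == 0.0 else min(1.0, total_clip / joint_norm)
    return [g * scale for g in groups]


def effective_noise_multiplier(clips, stddevs):
    """Noise multiplier of the equivalent single Gaussian sum query.

    With z_i = stddevs[i] / clips[i], rescaling each group to unit noise
    gives sensitivity sqrt(sum_i z_i**-2), hence z = (sum_i z_i**-2) ** -0.5.
    """
    inv_sq = sum((c / s) ** 2 for c, s in zip(clips, stddevs))
    return inv_sq ** -0.5


def noisy_sum(per_example_groups, clip_fn, stddevs, rng):
    """Clip every example's groups, sum per group, add Gaussian noise."""
    clipped = [clip_fn(groups) for groups in per_example_groups]
    totals = [np.sum([ex[i] for ex in clipped], axis=0)
              for i in range(len(stddevs))]
    return [t + rng.normal(0.0, s, size=t.shape)
            for t, s in zip(totals, stddevs)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two examples, each with two vector groups (e.g. two layers' gradients).
    examples = [[np.array([3.0, 4.0]), np.array([0.5])],
                [np.array([0.1, 0.2]), np.array([2.0])]]
    clips, stddevs = [1.0, 0.5], [1.1, 0.55]
    out = noisy_sum(examples, lambda g: clip_separately(g, clips), stddevs, rng)
    print(out, effective_noise_multiplier(clips, stddevs))
```

Separate clipping lets each group keep a bound and noise level suited to its own scale, at the cost of a smaller effective noise multiplier as more groups are added. Joint clipping uses a single bound for all groups together, so one large group can crowd out the others.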
