Have it your way: Individualized Privacy Assignment for DP-SGD
Franziska Boenisch, Christopher Mühl, Adam Dziedzic, Roy Rinberg, Nicolas Papernot
TL;DR
This work challenges the conventional DP-SGD practice of a single global privacy budget by proposing Individualized DP-SGD (IDP-SGD), which assigns distinct privacy budgets $\{\varepsilon_p\}$ to data points or groups of data points. It introduces two mechanisms, Sample and Scale, to realize per-group privacy: Sample adjusts the per-group sampling rates $\{q_p\}$, while Scale adjusts the per-group clip norms $c_p$ and noise multipliers $\sigma_p$. In both cases the parameters are chosen so that each group's budget is exhausted after the same fixed number of training iterations. The authors provide theoretical privacy guarantees and practical parameter derivations, and report experiments on vision and NLP tasks in which IDP-SGD improves utility over standard DP-SGD. LiRA-based membership inference analyses indicate that the individualized budgets translate into correspondingly different levels of empirical privacy protection. The authors also compare against individualized PATE and show that IDP-SGD can be combined with individualized privacy accounting. Overall, IDP-SGD offers a principled framework for aligning privacy guarantees with individual data owners' preferences while preserving model utility.
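The two mechanisms described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `sample_batch` and `scale_noisy_sum`, the array layout, and the choice to add noise once with a shared standard deviation `noise_std` are assumptions made for illustration. Under that assumption, a group clipped to a smaller bound $c_p$ sees a larger effective noise multiplier $\sigma_p = \texttt{noise\_std} / c_p$. Deriving the $q_p$, $c_p$, and $\sigma_p$ values that meet given budgets $\varepsilon_p$ requires a privacy accountant and is not shown.

```python
import numpy as np

def sample_batch(group_of_point, q_per_group, rng):
    """Sample-style step (sketch): Poisson sampling with per-group rates.

    Point i is included independently with probability
    q_per_group[group_of_point[i]].
    """
    probs = q_per_group[group_of_point]
    mask = rng.random(len(group_of_point)) < probs
    return np.flatnonzero(mask)

def scale_noisy_sum(per_example_grads, group_of_point, c_per_group, noise_std, rng):
    """Scale-style step (sketch): per-group clipping, then shared Gaussian noise.

    Each gradient is clipped to the norm bound of its group. Noise with a
    shared standard deviation (an assumption of this sketch) is added once
    to the sum of clipped gradients.
    """
    bounds = c_per_group[group_of_point]
    norms = np.linalg.norm(per_example_grads, axis=1)
    # Scale factor is 1 when the gradient is already within its bound.
    factors = np.minimum(1.0, bounds / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors[:, None]
    noise = rng.normal(0.0, noise_std, size=per_example_grads.shape[1])
    return clipped.sum(axis=0) + noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    groups = np.array([0, 0, 0, 1, 1, 1])   # hypothetical group assignment
    q = np.array([0.2, 0.6])                # hypothetical sampling rates
    c = np.array([0.5, 1.5])                # hypothetical clip norms
    idx = sample_batch(groups, q, rng)
    grads = rng.normal(size=(len(groups), 4))  # stand-in per-example gradients
    print(scale_noisy_sum(grads[idx], groups[idx], c, noise_std=1.0, rng=rng))
```

In this sketch, a group with a larger privacy budget would be given a higher sampling rate (Sample) or a larger clip norm (Scale), so that its points influence the update more strongly.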
Abstract
When training a machine learning model with differential privacy, one sets a privacy budget. This budget represents a maximal privacy violation that any user is willing to face by contributing their data to the training set. We argue that this approach is limited because different users may have different privacy expectations. Thus, setting a uniform privacy budget across all points may be overly conservative for some users or, conversely, not sufficiently protective for others. In this paper, we capture these preferences through individualized privacy budgets. To demonstrate their practicality, we introduce a variant of Differentially Private Stochastic Gradient Descent (DP-SGD) which supports such individualized budgets. DP-SGD is the canonical approach to training models with differential privacy. We modify its data sampling and gradient noising mechanisms to arrive at our approach, which we call Individualized DP-SGD (IDP-SGD). Because IDP-SGD provides privacy guarantees tailored to the preferences of individual users and their data points, we find it empirically improves privacy-utility trade-offs.
