Personalized Differential Privacy for Ridge Regression
Krishna Acharya, Franziska Boenisch, Rakshit Naidu, Juba Ziani
TL;DR
This work addresses the limitation of uniform privacy budgets in differential privacy by introducing Personalized-DP Output Perturbation (PDP-OP) for ridge regression, which supports per-data-point privacy levels through data-point weights and a controlled noise mechanism. It provides formal privacy proofs and new accuracy guarantees tailored to personalized DP, situating the method within private ERM and adding theoretical assurances where prior work on personalized DP offered only empirical evaluation. Empirically, PDP-OP significantly improves the privacy-utility trade-off over standard DP and over the personalized approach of Jorgensen et al. (2015) on both synthetic and real data, with lower loss and reduced variability. The approach enables finer-grained privacy control, with practical impact for privacy-sensitive ML tasks.
Abstract
The increased application of machine learning (ML) in sensitive domains requires protecting the training data through privacy frameworks such as differential privacy (DP). DP requires specifying a uniform privacy level $\varepsilon$ that bounds the maximum privacy loss any data point in the dataset is willing to tolerate. Yet, in practice, different data points often have different privacy requirements. Having to set one uniform privacy level is usually too restrictive, often forcing a learner to guarantee the most stringent privacy requirement for every data point, at a large cost to accuracy. To overcome this limitation, we introduce Personalized-DP Output Perturbation (PDP-OP), a novel method that enables training ridge regression models with individual per-data-point privacy levels. We provide rigorous privacy proofs for PDP-OP as well as accuracy guarantees for the resulting model. This work is the first to provide such theoretical accuracy guarantees for personalized DP in machine learning; previous work provided only empirical evaluations. We empirically evaluate PDP-OP on synthetic and real datasets and with diverse privacy distributions. We show that enabling each data point to specify its own privacy requirement significantly improves the privacy-accuracy trade-off in DP. We also show that PDP-OP outperforms the personalized privacy techniques of Jorgensen et al. (2015).
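For orientation, the notion of individual privacy levels can be stated as follows. This is a sketch of the standard personalized DP definition in the style of Jorgensen et al. (2015), not a statement taken from this abstract; the symbols $\mathcal{M}$ (the randomized mechanism), $D, D'$ (datasets), $S$ (a set of outputs), and $\varepsilon_i$ (the privacy level of data point $i$) are notation assumed here for illustration. A mechanism $\mathcal{M}$ satisfies personalized DP with levels $(\varepsilon_1, \dots, \varepsilon_n)$ if, for every data point $i$, every pair of datasets $D, D'$ that differ only in data point $i$, and every measurable set of outputs $S$,

$$
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon_i} \, \Pr[\mathcal{M}(D') \in S].
$$

Standard DP with a uniform level $\varepsilon$ is the special case $\varepsilon_i = \varepsilon$ for all $i$. Enforcing it for a whole dataset means using $\varepsilon = \min_i \varepsilon_i$, which is the "most stringent requirement" the abstract refers to.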
