Quantifying Classifier Utility under Local Differential Privacy
Ye Zheng, Yidan Hu
TL;DR
This work develops a unified theoretical framework to quantify classifier utility under Local Differential Privacy (LDP) by tying the concentration properties of LDP perturbations to classifier robustness. Utility is expressed as the probability that a classifier preserves its output under perturbation. The framework combines a concentration region $B_{\theta}(x)$, which contains the perturbed input with probability $p(\varepsilon,\theta)$, with a robustness analysis of the classifier on that region to obtain a utility bound $\rho(\varepsilon,\theta)$ that applies to any combination of LDP mechanism and classifier. It also introduces two refinement techniques, robustness hyperrectangles and a PAC-LDP relaxation (with an extended Gaussian mechanism and a privacy indicator), which enable more accurate utility quantification in higher dimensions. Case studies on stroke prediction, bank attrition, and MNIST-7×7 show that the theoretical utility closely tracks empirical results in low-dimensional settings and provides practical guidance for selecting LDP mechanisms and privacy parameters, with piecewise-based mechanisms (PM) often yielding superior utility.
Abstract
Local differential privacy (LDP) offers rigorous, quantifiable privacy guarantees for personal data by introducing perturbations at the data source. Understanding how these perturbations affect classifier utility is crucial for both designers and users. However, a general theoretical framework for quantifying this impact is lacking, and developing one is challenging, especially for complex or black-box classifiers. This paper presents a unified framework for theoretically quantifying classifier utility under LDP mechanisms. The key insight is that LDP perturbations are concentrated around the original data with a specific probability, allowing utility analysis to be reframed as robustness analysis within this concentrated region. Our framework thus connects the concentration properties of LDP mechanisms with the robustness of classifiers, treating LDP mechanisms as general distributional functions and classifiers as black boxes. This generality makes the framework applicable to any LDP mechanism and classifier. A direct application of our utility quantification is guiding the selection of LDP mechanisms and privacy parameters for a given classifier. Notably, our analysis shows that piecewise-based mechanisms often yield better utility than alternatives in common scenarios. Beyond the core framework, we introduce two novel refinement techniques that further improve utility quantification. We then present case studies illustrating utility quantification for various combinations of LDP mechanisms and classifiers. Results demonstrate that our theoretical quantification closely matches empirical observations, particularly when classifiers operate in lower-dimensional input spaces.
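The key insight can be illustrated with a minimal sketch, which is not the paper's construction. It assumes a one-dimensional input in [-1, 1], a Laplace mechanism with scale 2/ε, and a hypothetical threshold classifier. If the classifier's output is constant within radius θ of x, then the probability that the perturbed input stays within that radius is a lower bound on the probability that the output is preserved. All function names and parameter values below are illustrative assumptions.

```python
import math
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def classify(x, threshold=0.0):
    # Hypothetical black-box classifier: a simple threshold rule.
    return 1 if x >= threshold else 0


def concentration_prob(epsilon, theta, sensitivity=2.0):
    # P(|noise| <= theta) for Laplace noise with scale sensitivity / epsilon.
    # Assumption: inputs lie in [-1, 1], so the sensitivity is 2.
    return 1.0 - math.exp(-theta * epsilon / sensitivity)


def utility_lower_bound(x, epsilon, threshold=0.0, sensitivity=2.0):
    # Robustness radius of the threshold classifier at x is the distance
    # to the decision boundary; the output is constant inside that radius.
    theta = abs(x - threshold)
    return concentration_prob(epsilon, theta, sensitivity)


def empirical_utility(x, epsilon, n=20000, threshold=0.0,
                      sensitivity=2.0, seed=0):
    # Monte Carlo estimate of P[classify(x + noise) == classify(x)].
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    label = classify(x, threshold)
    kept = sum(
        classify(x + laplace_noise(scale, rng), threshold) == label
        for _ in range(n)
    )
    return kept / n


if __name__ == "__main__":
    x, epsilon = 0.5, 2.0
    print("lower bound:", utility_lower_bound(x, epsilon))
    print("empirical  :", empirical_utility(x, epsilon))
```

In this toy setting the bound equals 1 − e^{−0.5} ≈ 0.393, while the exact utility is 1 − 0.5·e^{−0.5} ≈ 0.697. The gap arises because the bound ignores noise that moves the input away from the decision boundary, which is the kind of looseness that tighter robustness regions aim to reduce.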
