Quantifying Classifier Utility under Local Differential Privacy

Ye Zheng, Yidan Hu

TL;DR

This work develops a unified theoretical framework for quantifying classifier utility under Local Differential Privacy (LDP) by tying the concentration properties of LDP perturbations to classifier robustness. Utility is expressed as the probability that a classifier preserves its output under perturbation: the perturbed input falls in a concentration region $B_{\theta}(x)$ with probability $p(\varepsilon,\theta)$, and a robustness analysis over that region yields a bound $\rho(\varepsilon,\theta)$ that applies to any combination of LDP mechanism and classifier. The work also introduces refinement techniques, including robustness hyperrectangles and PAC-LDP relaxations with an extended Gaussian mechanism and a privacy indicator, which make utility quantification more accurate in higher dimensions. Case studies on stroke prediction, bank attrition, and MNIST-7×7 show that the theoretical utility closely tracks empirical results in low-dimensional settings and offers practical guidance for selecting LDP mechanisms and privacy parameters, with piecewise-based mechanisms (PM) often yielding superior utility.

Abstract

Local differential privacy (LDP) offers rigorous, quantifiable privacy guarantees for personal data by introducing perturbations at the data source. Understanding how these perturbations affect classifier utility is crucial for both designers and users. However, a general theoretical framework for quantifying this impact is lacking, and building one is challenging, especially for complex or black-box classifiers. This paper presents a unified framework for theoretically quantifying classifier utility under LDP mechanisms. The key insight is that LDP perturbations are concentrated around the original data with a specific probability, so utility analysis can be reframed as robustness analysis within this concentrated region. Our framework thus connects the concentration properties of LDP mechanisms with the robustness of classifiers, treating LDP mechanisms as general distributional functions and classifiers as black boxes. This generality makes the framework applicable to any LDP mechanism and classifier. A direct application of our utility quantification is guiding the selection of LDP mechanisms and privacy parameters for a given classifier. Notably, our analysis shows that piecewise-based mechanisms often yield better utility than alternatives in common scenarios. Beyond the core framework, we introduce two novel refinement techniques that further improve utility quantification. We then present case studies illustrating utility quantification for various combinations of LDP mechanisms and classifiers. Results demonstrate that our theoretical quantification closely matches empirical observations, particularly when classifiers operate in lower-dimensional input spaces.
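
As a rough illustration of the key insight (a minimal sketch, not the paper's implementation), the code below compares a concentration-based lower bound with a Monte Carlo estimate of utility. Everything specific here is an assumption made for the example: a one-dimensional input, Laplace noise with scale `sensitivity / eps` where `sensitivity = 1`, and a hypothetical threshold classifier with its decision boundary at 0.8, so that the input `x = 0.5` has robustness radius `theta = 0.3`.

```python
import math
import random


def laplace_concentration(eps, theta, sensitivity=1.0):
    """P(|noise| <= theta) for Laplace noise with scale sensitivity / eps."""
    return 1.0 - math.exp(-eps * theta / sensitivity)


def sample_laplace(scale, rng):
    """Draw Laplace(0, scale) noise as a random sign times an exponential."""
    magnitude = rng.expovariate(1.0 / scale)
    return magnitude if rng.random() < 0.5 else -magnitude


def classify(x, boundary=0.8):
    """Hypothetical black-box classifier: a single threshold on the input."""
    return 1 if x >= boundary else 0


def empirical_utility(x, eps, n=100_000, seed=0, sensitivity=1.0):
    """Fraction of perturbed inputs whose predicted label is unchanged."""
    rng = random.Random(seed)
    scale = sensitivity / eps
    label = classify(x)
    kept = sum(classify(x + sample_laplace(scale, rng)) == label for _ in range(n))
    return kept / n


x, eps, theta = 0.5, 4.0, 0.3  # theta = distance from x to the boundary at 0.8
bound = laplace_concentration(eps, theta)
observed = empirical_utility(x, eps)
print(f"concentration-based lower bound: {bound:.4f}")
print(f"Monte Carlo utility estimate:    {observed:.4f}")
```

In this toy setting the bound is conservative: noise that pushes the input far to the left also leaves the label unchanged, but a symmetric region around `x` does not count it. That gap is the kind of slack a tighter robustness region would reduce.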

Paper Structure

This paper contains 66 sections, 5 theorems, 56 equations, 16 figures, 2 tables.

Key Result

Proposition 1

Let $\mathcal{M}(x)$ be an $\varepsilon$-LDP mechanism applied to an input $x \in \mathbb{R}$, and let $F_\mathcal{M}$ denote the CDF of the perturbed output $\mathcal{M}(x)$. The concentration property of $\mathcal{M}(x)$ is given by:
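
The display that follows this statement is not reproduced here. As a hedged sketch of the general shape such a CDF-based statement takes, assume the concentration region is the interval $B_\theta(x) = [x-\theta, x+\theta]$ and that $F_\mathcal{M}$ is continuous at its endpoints. The second line is a worked instance for Laplace noise with scale $\Delta/\varepsilon$, where the sensitivity $\Delta$ is a symbol introduced only for this example:

$$
\begin{aligned}
p(\varepsilon,\theta) &= \Pr\left[\mathcal{M}(x) \in B_\theta(x)\right] = F_\mathcal{M}(x+\theta) - F_\mathcal{M}(x-\theta), \\
p_{\mathrm{Lap}}(\varepsilon,\theta) &= 1 - \exp\left(-\varepsilon\theta/\Delta\right).
\end{aligned}
$$

For instance, with $\varepsilon = 4$, $\theta = 0.3$, and $\Delta = 1$, the Laplace expression gives $1 - e^{-1.2} \approx 0.70$.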

Figures (16)

  • Figure 1: From empirical data utility to theoretical data utility. This paper provides a theoretical quantification of classifier utility under LDP mechanisms by connecting the concentration analysis of LDP mechanisms with the robustness analysis of classifiers.
  • Figure 2: Illustration of the example and utility quantification of the classifier under the Laplace mechanism for $x=0.5$.
  • Figure 3: Robustness radius $\theta$ of a $2$D classifier at $x$ and the tested decision boundary by brute force.
  • Figure 4: Comparison of the robustness probability $\rho(\varepsilon, \theta)$ for different LDP mechanisms with $\varepsilon=2, 4, 6$. Details of the mechanisms and discussions on the curves of $\rho(\varepsilon, \theta)$ are provided in the appendix.
  • Figure 5: Two robustness hyperrectangles (green dashed boxes) for the classifier in Figure 3. The original robustness radius $\theta = 0.2$ corresponds to the gray dashed box.
  • ...and 11 more figures

Theorems & Definitions (31)

  • Definition 1: $\varepsilon$-LDP [DBLP:journals/corr/DuchiWJ16]
  • Definition 2: Classifier
  • Definition 3: Local robustness
  • Definition 4: The Laplace mechanism [DBLP:journals/fttcs/DworkR14]
  • Proposition 1: Concentration property of LDP
  • Definition 5
  • Definition 6
  • Definition 7
  • Definition 8: Probabilistic robustness [robust_analysis, DBLP:conf/pkdd/ZhangRF22]
  • Theorem 1: Hoeffding bound [Hoeffding_inequality]
  • ...and 21 more