Towards Measuring Membership Privacy

Yunhui Long, Vincent Bindschaedler, Carl A. Gunter

TL;DR

The paper introduces Differential Training Privacy (DTP) as an empirical, per-record privacy risk metric for membership inference in classifiers released via public interfaces. It also defines PDTP, a computable lower bound, and studies how training stability can bound DTP, enabling practical risk assessment when differential privacy is not feasible. Through case studies on NN, NB, and LR across the Adult and Purchase datasets, the authors demonstrate a strong correlation between PDTP and membership attack success, and propose the DTP-1 rule as a publishing guideline. While not a DP guarantee, DTP provides a practical framework for evaluating privacy risks and informs defensive decisions, including potential record-level data curation and further theoretical work on indirect attacks and stability.

Abstract

Machine learning models are increasingly made available to the masses through public query interfaces. Recent academic work has demonstrated that malicious users who can query such models are able to infer sensitive information about records within the training data. Differential privacy can thwart such attacks, but not all models can be readily trained to achieve this guarantee or to achieve it with acceptable utility loss. As a result, if a model is trained without a differential privacy guarantee, little is known or can be said about the privacy risk of releasing it. In this work, we investigate and analyze membership attacks to understand why and how they succeed. Based on this understanding, we propose Differential Training Privacy (DTP), an empirical metric to estimate the privacy risk of publishing a classifier when methods such as differential privacy cannot be applied. DTP is a measure of a classifier with respect to its training dataset, and we show that calculating DTP is efficient in many practical cases. We empirically validate DTP using state-of-the-art machine learning models such as neural networks trained on real-world datasets. Our results show that DTP is highly predictive of the success of membership attacks and therefore reducing DTP also reduces the privacy risk. We advocate for DTP to be used as part of the decision-making process when considering publishing a classifier. To this end, we also suggest adopting the DTP-1 hypothesis: if a classifier has a DTP value above 1, it should not be published.

Paper Structure

This paper contains 24 sections, 7 theorems, 25 equations, 8 figures, 3 tables, 1 algorithm.

Key Result

Theorem 6.2

If a record $t \in T$ is $\epsilon$-PDTP with classification algorithm $\mathcal{A}$ and dataset $T$, and $\mathcal{A}$ is $\delta$-training stable on $T$, then $t$ is $\epsilon'$-DTP with $\mathcal{A}$ and $T$, where $\epsilon' = \max(\epsilon, \ln\delta)$.
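The bound in Theorem 6.2 is simple arithmetic, so a minimal Python sketch may help make it concrete. The function names `dtp_bound` and `passes_dtp1` are hypothetical and not from the paper. The sketch also assumes that a classifier's DTP value is the largest per-record bound, which this summary does not state explicitly.

```python
import math


def dtp_bound(pdtp_epsilon: float, stability_delta: float) -> float:
    """Return epsilon' = max(epsilon, ln(delta)) as in Theorem 6.2.

    pdtp_epsilon: the record's PDTP value (epsilon).
    stability_delta: the training-stability parameter (delta), must be > 0.
    """
    if stability_delta <= 0:
        raise ValueError("delta must be positive so that ln(delta) is defined")
    return max(pdtp_epsilon, math.log(stability_delta))


def passes_dtp1(per_record_bounds) -> bool:
    """Apply the DTP-1 hypothesis: do not publish if the DTP value exceeds 1.

    Assumption (not stated in this summary): the classifier-level DTP value
    is taken to be the maximum of the per-record bounds.
    """
    return max(per_record_bounds) <= 1.0


# Illustrative inputs only; these numbers are made up, not results from the paper.
bounds = [dtp_bound(0.4, 1.5), dtp_bound(0.8, 1.5)]
publishable = passes_dtp1(bounds)
```

With these illustrative inputs, the first record's bound is governed by the stability term ($\ln 1.5 \approx 0.405 > 0.4$), while the second is governed by its PDTP value ($0.8$). Both stay below 1, so the DTP-1 check passes.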

Figures (8)

  • Figure 1: Membership Attacks on NN-Purchase.
  • Figure 2: Maximum Per-Target Accuracy of Three Membership Attacks on NN-Purchase.
  • Figure 3: Correlation between Average PDTP and Membership Attack Accuracy.
  • Figure 4: Membership Attacks on classifiers learned on the Purchase dataset.
  • Figure 5: Membership Attacks on classifiers learned on the Adult dataset.
  • ...and 3 more figures

Theorems & Definitions (13)

  • Definition 4.1: Differential Training Privacy
  • Definition 4.2
  • Definition 4.3
  • Definition 4.4
  • Definition 6.1
  • Theorem 6.2
  • Proposition 6.3
  • Proposition 6.4
  • Proposition 6.5
  • Definition 6.6: Linear Statistical Queries classifier [roth1999learning]
  • ...and 3 more