OSLO: One-Shot Label-Only Membership Inference Attacks

Yuefeng Peng, Jaechul Roh, Subhransu Maji, Amir Houmansadr

Abstract

We introduce One-Shot Label-Only (OSLO) membership inference attacks (MIAs), which infer a given sample's membership in a target model's training set with high precision using just a single query, where the target model returns only the predicted hard label. This is in contrast to state-of-the-art label-only attacks, which require ~6000 queries yet achieve lower attack precision than OSLO. OSLO leverages transfer-based black-box adversarial attacks. The core idea is that a member sample exhibits more resistance to adversarial perturbations than a non-member. We compare OSLO against state-of-the-art label-only attacks and demonstrate that, despite requiring only one query, our method significantly outperforms previous attacks in terms of precision and true positive rate (TPR) under the same false positive rate (FPR). For example, compared to previous label-only MIAs, OSLO achieves a TPR that is at least 7× higher under a 1% FPR and at least 22× higher under a 0.1% FPR on CIFAR-100 for a ResNet18 model. We also evaluate multiple defense mechanisms against OSLO.
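The core idea in the abstract (perturb the sample using a surrogate, query the target once, and call the sample a member if its label survives) can be illustrated with a toy sketch. This is not the authors' implementation: the linear surrogate, the FGSM-style step, the fixed budget `eps`, and all function names here are assumptions chosen to keep the example small, and binary labels in {-1, +1} are assumed.

```python
import numpy as np

def craft_transfer_adversarial(x, y, surrogate_w, eps):
    """FGSM-style step against a linear surrogate (a toy stand-in for a
    transfer-based adversarial attack). Labels are assumed to be -1 or +1."""
    # For a linear score s(x) = w.x, the L-infinity step of size eps that
    # most reduces y * s(x) is -eps * y * sign(w).
    return x - eps * y * np.sign(surrogate_w)

def one_shot_label_only_mia(x, y, surrogate_w, target_predict, eps):
    """Predict 'member' (True) if the target's hard label survives the
    perturbation. The target model is queried exactly once."""
    x_adv = craft_transfer_adversarial(x, y, surrogate_w, eps)
    return target_predict(x_adv) == y  # the single label-only query

# Hypothetical target: a linear model exposed only through hard labels.
target_w = np.array([1.0, 0.5])

def target_predict(x):
    return 1 if float(target_w @ x) >= 0 else -1

# Hypothetical surrogate weights, assumed to be trained by the attacker.
surrogate_w = np.array([0.9, 0.6])

far_from_boundary = np.array([3.0, 2.0])  # resists the perturbation
near_boundary = np.array([0.2, 0.1])      # label flips under the same budget

robust_is_member = one_shot_label_only_mia(
    far_from_boundary, 1, surrogate_w, target_predict, eps=0.5)
fragile_is_member = one_shot_label_only_mia(
    near_boundary, 1, surrogate_w, target_predict, eps=0.5)
```

In this toy setup the sample far from the decision boundary keeps its label and is flagged as a member, while the sample near the boundary is not. The sketch fixes one global `eps` for simplicity; how the perturbation strength should actually be chosen is not specified in the abstract.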

Paper Structure

This paper contains 47 sections, 2 equations, 13 figures, 9 tables, and 1 algorithm.

Figures (13)

  • Figure 1: An illustration of OSLO versus the state-of-the-art boundary attack. Boundary attack requires querying the target model thousands of times, whereas OSLO requires only a single query.
  • Figure 2: ROC curves for various label-only attacks on three different datasets on ResNet18. Each line represents the TPR of an attack under different FPRs, with an emphasis on the low-FPR regime using a logarithmic scale.
  • Figure 3: ROC curves for various label-only attacks on three different datasets on DenseNet121. Each line represents the TPR of an attack under different FPRs, with an emphasis on the low-FPR regime using a logarithmic scale.
  • Figure 3: Attack TPR and FPR of OSLO without validation models on CIFAR-10 using ResNet18.
  • Figure 4: Precision-Recall curves for various label-only attacks on ResNet18. Each line represents the trade-off between precision and recall for an attack as the attack parameter is varied.
  • ...and 8 more figures
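
The ROC figures above report TPR at fixed low FPRs. As a rough illustration of that metric, the sketch below computes the best TPR achievable at or below a given FPR. It assumes each sample has a real-valued membership score (higher means more member-like); the function name `tpr_at_fpr` and the example scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """Highest TPR over all thresholds whose FPR is <= target_fpr.
    A sample is predicted 'member' when its score >= threshold."""
    member_scores = np.asarray(member_scores, dtype=float)
    nonmember_scores = np.asarray(nonmember_scores, dtype=float)
    # Candidate thresholds: every observed score, plus +inf (flag nobody).
    thresholds = np.concatenate([member_scores, nonmember_scores, [np.inf]])
    best_tpr = 0.0
    for t in thresholds:
        fpr = np.mean(nonmember_scores >= t)
        if fpr <= target_fpr:
            best_tpr = max(best_tpr, float(np.mean(member_scores >= t)))
    return best_tpr

# Made-up scores, for illustration only.
members = [0.9, 0.8, 0.7, 0.4]
nonmembers = [0.85, 0.3, 0.2, 0.1]

tpr_zero_fpr = tpr_at_fpr(members, nonmembers, target_fpr=0.0)
tpr_quarter_fpr = tpr_at_fpr(members, nonmembers, target_fpr=0.25)
```

With these made-up scores, allowing no false positives leaves only the top-scoring member detectable, whereas tolerating one false positive in four non-members recovers all members. This is why results in the low-FPR regime are typically shown on a logarithmic scale.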