Defending Membership Inference Attacks via Privacy-aware Sparsity Tuning

Qiang Hu, Hengxiang Zhang, Hongxin Wei

TL;DR

The key idea behind PAST is to promote sparsity in the parameters that contribute most to privacy leakage. It applies an adaptive penalty to each parameter, with a weight constructed from that parameter's privacy sensitivity.

Abstract

Over-parameterized models are typically vulnerable to membership inference attacks, which aim to determine whether a specific sample was included in the training data of a given model. Previous weight regularizations (e.g., L1 regularization) typically impose uniform penalties on all parameters, leading to a suboptimal trade-off between model utility and privacy. In this work, we first show that only a small fraction of parameters substantially impacts the privacy risk. In light of this, we propose Privacy-aware Sparsity Tuning (PAST), a simple fix to L1 regularization that applies adaptive penalties to different parameters. The key idea behind PAST is to promote sparsity in parameters that significantly contribute to privacy leakage. In particular, we construct the adaptive weight for each parameter based on its privacy sensitivity, i.e., the gradient of the loss gap with respect to the parameter. Using PAST, the network shrinks the loss gap between members and non-members, leading to strong resistance to privacy attacks. Extensive experiments demonstrate the superiority of PAST, which achieves a state-of-the-art balance in the privacy-utility trade-off.
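
The weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear model, the squared loss, the use of the absolute gradient, the normalization of sensitivities to sum to one, and every function name below are assumptions chosen only to keep the example self-contained.

```python
# Illustrative sketch of privacy-aware adaptive L1 weighting.
# Assumptions (not from the paper): linear model, squared loss,
# |gradient| as sensitivity, weights normalized to sum to 1.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse_loss(w, X, y):
    """Mean squared error of the linear model on (X, y)."""
    return sum((predict(w, x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def mse_grad(w, X, y):
    """Analytic gradient of mse_loss with respect to w."""
    g = [0.0] * len(w)
    for x, t in zip(X, y):
        err = predict(w, x) - t
        for j, xj in enumerate(x):
            g[j] += 2.0 * err * xj / len(X)
    return g

def loss_gap(w, members, non_members):
    """Non-member loss minus member loss."""
    return mse_loss(w, *non_members) - mse_loss(w, *members)

def privacy_sensitivity(w, members, non_members):
    """Per-parameter |d(loss gap)/dw|."""
    g_mem = mse_grad(w, *members)
    g_non = mse_grad(w, *non_members)
    return [abs(a - b) for a, b in zip(g_non, g_mem)]

def adaptive_weights(sensitivity, eps=1e-12):
    """Normalize sensitivities so the weights sum to (almost) 1."""
    total = sum(sensitivity) + eps
    return [s / total for s in sensitivity]

def weighted_l1_penalty(w, weights):
    """L1 penalty with one weight per parameter instead of a uniform one."""
    return sum(a * abs(wi) for a, wi in zip(weights, w))

# Toy data: the members are fit perfectly, the non-members are not,
# and only the second parameter is responsible for the gap.
members = ([[1.0, 0.0], [1.0, 0.0]], [1.0, 1.0])
non_members = ([[1.0, 1.0], [1.0, -1.0]], [1.0, 1.0])
w = [1.0, 0.5]

gap = loss_gap(w, members, non_members)
weights = adaptive_weights(privacy_sensitivity(w, members, non_members))
penalty = weighted_l1_penalty(w, weights)
```

In this toy setup the first parameter receives essentially no penalty while the second carries almost all of it, whereas a uniform L1 penalty would shrink both equally. In a training loop, a penalty of this kind would be added to the task loss with some regularization strength, which is left out here.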

Paper Structure

This paper contains 39 sections, 7 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: (a) Loss gaps and attack advantage during standard training. The attack advantage increases in step with the loss gap throughout training, showing the privacy leakage of over-parameterization; we therefore use the loss gap as a proxy for privacy risk; (b) The privacy sensitivity distribution across parameters. Only a small fraction of parameters substantially impacts the privacy risk. (The cumulative sensitivity of the top 20% of parameters exceeds 89.27% of the total.)
  • Figure 2: (a) Weight distribution before (Base) and after regularization (Ours). The weights of Ours are clearly more concentrated around 0 and thus sparser than those of the base; (b) Gini index (criterion for sparsity) during the regularization process. The Gini index continues decreasing during tuning, which also demonstrates the sparsity effect of PAST; (c) Loss gap throughout the whole training process. The regularization (beginning at epoch 100) quickly reduces the loss gap, leading to strong resistance to privacy attacks.
  • Figure 3: Comparison of five defense mechanisms on the CIFAR-10 dataset using the ResNet18 architecture. Each subplot corresponds to a distinct attack method, and each curve shows the performance of one defense mechanism under different hyperparameter settings. The horizontal axis represents the target models' test accuracy (the higher the better), and the vertical axis represents the corresponding attack advantage (defined in Definition \ref{def: adv}, the lower the better). To underscore the disparity between the defense methods and the vanilla (undefended) model, we plot dotted lines originating from the vanilla results.
  • Figure 4: Comparison of seven defense mechanisms on the CIFAR-100 dataset using the DenseNet121 architecture. Each subplot corresponds to a distinct attack method, and each curve shows the performance of one defense mechanism under different hyperparameter settings. The horizontal axis represents the target models' test accuracy (the higher the better), and the vertical axis represents the corresponding attack advantage (defined in Definition \ref{def: adv}, the lower the better). To underscore the disparity between the defense methods and the vanilla (undefended) model, we plot dotted lines originating from the vanilla results.
  • Figure 5: (a) Utility-privacy trade-offs for fixed weights versus our adaptive weights, under $\ell_1$ and $\ell_2$ regularization. Dots of each color represent the performance of one tuning mechanism under different hyperparameter settings. The horizontal axis represents the test accuracy (the higher the better), and the vertical axis represents the attack advantage averaged across various attack methods (defined in Definition \ref{def: adv}, the lower the better; non-averaged results are in Appendix \ref{full_ablation}). PAST (L1+Ours) outperforms the others; (b) Utility-privacy trade-offs (by tuning $\alpha$) for different $\lambda$. The x-axis and y-axis are the same as in (a). Within a certain range ($\lambda=0.0005,0.001$ here), the trade-off curve remains stable.
  • ...and 5 more figures
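
The concentration statistic quoted in the Figure 1(b) caption (the share of total sensitivity held by the top 20% of parameters) can be computed along the following lines. This is a sketch under assumptions: the function name `top_fraction_share` and the toy values are hypothetical, and the paper's exact procedure may differ.

```python
def top_fraction_share(sensitivities, fraction=0.2):
    """Share of total |sensitivity| held by the largest `fraction` of entries."""
    magnitudes = sorted((abs(s) for s in sensitivities), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    # Always keep at least one parameter in the "top" group.
    k = max(1, int(round(fraction * len(magnitudes))))
    return sum(magnitudes[:k]) / total

# Hypothetical sensitivities for ten parameters (not data from the paper).
toy = [8.0, 0.5, 0.5, 0.4, 0.3, 0.1, 0.1, 0.05, 0.03, 0.02]
share = top_fraction_share(toy, fraction=0.2)
```

A share close to 1 means that a few parameters dominate the sensitivity, which is the situation the caption describes and the motivation for penalizing parameters non-uniformly.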