Technical Report for the Forgotten-by-Design Project: Targeted Obfuscation for Machine Learning

Rickard Brännvall, Laurynas Adomaitis, Olof Görnerup, Anass Sedrati

TL;DR

This paper introduces Forgotten by Design, a proactive approach to privacy preservation that integrates instance-specific obfuscation techniques during the AI model training process, and presents visualization methods for the privacy-utility trade-off, providing a clear framework for balancing privacy risk and model accuracy.

Abstract

The right to privacy, enshrined in various human rights declarations, faces new challenges in the age of artificial intelligence (AI). This paper explores the concept of the Right to be Forgotten (RTBF) within AI systems, contrasting it with traditional data erasure methods. We introduce Forgotten by Design, a proactive approach to privacy preservation that integrates instance-specific obfuscation techniques during the AI model training process. Unlike machine unlearning, which modifies models post-training, our method prevents sensitive data from being embedded in the first place. Using the LIRA membership inference attack, we identify vulnerable data points and propose defenses that combine additive gradient noise and weighting schemes. Our experiments on the CIFAR-10 dataset demonstrate that our techniques reduce privacy risks by at least an order of magnitude while maintaining model accuracy (at 95% significance). Additionally, we present visualization methods for the privacy-utility trade-off, providing a clear framework for balancing privacy risk and model accuracy. This work contributes to the development of privacy-preserving AI systems that align with human cognitive processes of motivated forgetting, offering a robust framework for safeguarding sensitive information and ensuring compliance with privacy regulations.
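The combination of additive gradient noise and instance weighting mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the noise is injected or how the weights are derived, so everything below is an assumption made for illustration: the function name `noisy_weighted_gradient`, the weighted-mean aggregation, and the choice to add isotropic Gaussian noise to the aggregated gradient are hypothetical, not the authors' implementation.

```python
import random


def noisy_weighted_gradient(per_example_grads, weights, sigma, rng):
    """Weighted mean of per-example gradients plus Gaussian noise.

    per_example_grads: list of gradient vectors (lists of floats), one per example.
    weights: one non-negative weight per example; a smaller weight reduces
             that example's influence on the update (assumed scheme).
    sigma: standard deviation of the additive noise (assumed isotropic).
    rng: a random.Random instance, passed in for reproducibility.
    """
    dim = len(per_example_grads[0])
    total_weight = sum(weights)
    grad = [0.0] * dim
    for g, w in zip(per_example_grads, weights):
        for j in range(dim):
            grad[j] += w * g[j] / total_weight
    # Obfuscate the aggregated gradient with zero-mean Gaussian noise.
    return [gj + rng.gauss(0.0, sigma) for gj in grad]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[1.0, 2.0], [3.0, 4.0]]
    # Down-weight the second example, as if it had been flagged as high risk.
    print(noisy_weighted_gradient(grads, [1.0, 0.2], sigma=0.01, rng=rng))
```

With `sigma=0.0` and equal weights the function reduces to the ordinary mean gradient, which makes the two defenses easy to toggle independently when comparing against a baseline.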

Paper Structure (51 sections, 14 equations, 7 figures, 4 tables)


Figures (7)

  • Figure 1: ROC curve for an example LIRA membership inference attack on a ResNet-18 model trained on the CIFAR-10 dataset. The curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) across threshold settings, showing the attack's ability to distinguish members of the training dataset from non-members. The attack applies logit scaling, models the observations with Gaussian distributions, and decides with a likelihood ratio test. Compared with a random attack, the LIRA attack identifies training data points significantly better than random guessing.
  • Figure 2: ROC Curves for LIRA Membership Inference Attack on ResNet-18 Model Trained on CIFAR-10 Dataset comparing the performance of the attack under different scenarios: baseline ("uniform: 0.0") with no weighting, uniform noise addition (0.01 and 0.02), and noise addition under a weighting scheme. Both adding obfuscating noise and down-weighting data points with high privacy risk push the ROC curve toward the random baseline. The combined effect of noise and weights significantly reduces the attacker's advantage, approaching random guessing as noise increases.
  • Figure 3: Privacy-Utility Trade-Off for Different Noise Levels and Weighting Schemes. The scatter plots illustrate the relationship between two privacy risk metrics, AUC and tau (log ratio of TPR over FPR), and model accuracy for different scenarios: baseline (no noise, no weights), uniform noise addition, and noise addition under a weighting scheme, where each dot is based on at least 300 shadow models. The curves show that as noise magnitude increases, the privacy risk metrics decrease, initially with minimal (or even slightly positive) impact on accuracy. The red dashed line shows the convex hull for the weighted approach, which achieves better accuracy at intermediate risk levels than the blue dotted line for uniform noise addition alone, indicating an advantage in balancing privacy and utility.
  • Figure 4: Distribution of privacy vulnerability t-scores for Different Noise Levels and Weighting Schemes. The figure shows the distribution of t-scores for various scenarios: baseline (no noise, no weights), uniform noise addition, and noise addition under a weighting scheme. The results indicate that both noise addition and weighting shift the distribution of t-scores towards lower values, reducing the privacy risk. The combined effect of noise and weighting is particularly effective in lowering the highest privacy vulnerability t-scores.
  • Figure 5: Distribution of privacy vulnerability t-scores sorted by the original ranking from the baseline model. The figure shows the t-scores for various scenarios: baseline (blue solid line, $\sigma = 0.0$), noise addition only (green solid line, $\sigma = 0.01$), weighting only (blue dashed line, $\sigma = 0.0$), and both noise addition and weighting (green dashed line, $\sigma = 0.01$). The application of noise and weighting shifts the distribution towards lower t-scores, reducing privacy risk. This effect is particularly noticeable for the data points that have the highest original risk scores (i.e., without noise and weighting).
  • ...and 2 more figures
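
The attack steps named in the Figure 1 caption (logit scaling, Gaussian modelling, likelihood ratio test) and the AUC and tau metrics from the Figure 3 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the use of per-example confidences from "in" and "out" shadow models as inputs, and the operating point at which tau is evaluated are all hypothetical.

```python
import math
import statistics


def logit_scale(p, eps=1e-6):
    """Map a confidence p to log(p / (1 - p)), clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def gaussian_logpdf(x, mean, std):
    """Log-density of N(mean, std^2) evaluated at x."""
    var = std * std
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)


def lira_score(target_conf, in_confs, out_confs):
    """Log-likelihood ratio that an example was a training member.

    in_confs / out_confs: confidences on this example from shadow models
    trained with / without it (assumed inputs for this sketch).
    """
    x = logit_scale(target_conf)
    in_logits = [logit_scale(c) for c in in_confs]
    out_logits = [logit_scale(c) for c in out_confs]
    log_p_in = gaussian_logpdf(
        x, statistics.mean(in_logits), statistics.stdev(in_logits))
    log_p_out = gaussian_logpdf(
        x, statistics.mean(out_logits), statistics.stdev(out_logits))
    return log_p_in - log_p_out


def tpr_fpr(member_scores, nonmember_scores, threshold):
    """TPR and FPR when every score above the threshold is called a member."""
    tpr = sum(s > threshold for s in member_scores) / len(member_scores)
    fpr = sum(s > threshold for s in nonmember_scores) / len(nonmember_scores)
    return tpr, fpr


def tau(tpr, fpr):
    """Log ratio of TPR over FPR; assumes both rates are strictly positive."""
    return math.log(tpr / fpr)


def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member."""
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))
```

A random attack gives an AUC of 0.5 and a tau of 0, so both metrics measure how far an attack sits above the diagonal of the ROC curve.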