Silver Linings in the Shadows: Harnessing Membership Inference for Machine Unlearning

Nexhi Sula, Abhinav Kumar, Jie Hou, Han Wang, Reza Tourani

TL;DR

The paper addresses the challenge of removing a data subject's influence from trained neural networks under GDPR without full retraining. It introduces ReMI, a machine unlearning framework that uses a privacy approximation function, such as membership inference or membership fingerprinting, to guide weight refinement and minimize leakage from forgotten data while preserving primary-task performance. A novel unlearning loss combines the target-model loss with a leakage term and is augmented by a KL-divergence-based objective to reduce distinguishability between forgotten data and out-of-sample data; a Gaussian-based upper bound is provided for tractability. Empirically, ReMI demonstrates strong unlearning efficacy and latency advantages across four datasets and four architectures, outperforming naive retraining and Fisher unlearning in several settings and enabling rapid, privacy-preserving forgetting with maintained accuracy.

Abstract

With the continued advancement and widespread adoption of machine learning (ML) models across various domains, ensuring user privacy and data security has become a paramount concern. In compliance with data privacy regulations such as GDPR, a secure machine learning framework should not only grant users the right to request the removal of their contributed data used for model training but also facilitate the elimination of sensitive data fingerprints within machine learning models to mitigate potential attacks, a process referred to as machine unlearning. In this study, we present a novel unlearning mechanism designed to effectively remove the impact of specific data samples from a neural network while considering the performance of the unlearned model on the primary task. To achieve this goal, we crafted a novel loss function tailored to eliminate privacy-sensitive information from the weights and activation values of the target model by combining target classification loss and membership inference loss. Our adaptable framework can easily incorporate various privacy leakage approximation mechanisms to guide the unlearning process. We provide empirical evidence of the effectiveness of our unlearning approach, together with a theoretical upper-bound analysis, using a membership inference mechanism as a proof of concept. Our results showcase the superior performance of our approach in terms of unlearning efficacy and latency, as well as the fidelity of the primary task, across four datasets and four deep learning architectures.
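The combination of a target classification loss with a membership inference loss can be pictured with a small sketch. This is a minimal illustration, not the authors' implementation: the function names, the `leak_weight` trade-off coefficient, and the use of the mean attack score as the leakage term are all assumptions made for this example.

```python
import math

def cross_entropy(probs, label):
    """Negative log-probability that the target model assigns to the true class."""
    return -math.log(max(probs[label], 1e-12))

def unlearning_loss(retain_batch, forget_attack_scores, leak_weight=1.0):
    """Hypothetical combined objective (assumed form, not taken from the paper).

    retain_batch: list of (class_probabilities, true_label) pairs for data to keep.
    forget_attack_scores: membership probabilities that an attack model assigns
        to the samples that should be forgotten.
    leak_weight: assumed coefficient balancing task fidelity against leakage.
    """
    # Primary-task term: average cross-entropy on the retained data.
    task_loss = sum(cross_entropy(p, y) for p, y in retain_batch) / len(retain_batch)
    # Leakage term: average "member" score on the forgotten data.
    leakage = sum(forget_attack_scores) / len(forget_attack_scores)
    return task_loss + leak_weight * leakage

# Toy usage: one retained sample, one forgotten sample.
retain_batch = [([0.5, 0.5], 0)]
print(unlearning_loss(retain_batch, [0.5], leak_weight=2.0))
```

In this sketch, lowering the attack model's confidence on the forgotten samples lowers the total loss, while the cross-entropy term keeps the model anchored to its primary task.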

Paper Structure

This paper contains 16 sections, 17 equations, 14 figures, and 6 tables.

Figures (14)

  • Figure 1: In ReMI, the target model training follows the conventional training approach (§\ref{sec:target}). ReMI uses a privacy approximation function (§\ref{sec:attack}) to infer the privacy-sensitive information of the training data and uses it to guide the unlearning process (§\ref{sec:unlearning}).
  • Figure 2: Membership inference attack (MIA) probability distributions before and after unlearning. Before unlearning, the MIA assigns a high membership likelihood to the forgetting data and an extremely low likelihood to the out-of-sample data. The objective of our unlearning process is to minimize the divergence between these probability distributions.
  • Figure 3: The train and test classification accuracy of four datasets across four deep learning architectures before unlearning.
  • Figure 4: Accuracy of the privacy approximation functions (MIA and MF) before unlearning in both white-box and black-box forms across all datasets. The results suggest that MIA and MF models perform similarly. Thus, either of these models can be used as the privacy approximation function to guide ReMI's unlearning process.
  • Figure 5: Distribution of efficacy scores on the forgetting data for the target models before and after unlearning across four datasets, indicating how much information each model can leak. The results indicate that the model unlearned with ReMI exposes less information than the original target model. Additionally, the efficacy of the unlearning algorithms varies with the complexity of the deep learning architecture employed in the target model and the size of the forgetting dataset.
  • ...and 9 more figures
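
The divergence objective described for Figure 2 can be illustrated with the standard closed-form KL divergence between two univariate Gaussians. The sketch below assumes, purely for illustration, that the attack scores on the forgetting data and on the out-of-sample data are each summarized by a fitted Gaussian; the helper names and the toy score lists are hypothetical, and the paper's actual bound may take a different form.

```python
import math
from statistics import fmean, pstdev

def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """KL( N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2) ) in closed form."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)

def score_divergence(forget_scores, unseen_scores, eps=1e-6):
    """Fit a Gaussian to each set of attack scores and compare them (assumed setup)."""
    # eps keeps the standard deviations strictly positive for constant score lists.
    mu_f, sd_f = fmean(forget_scores), pstdev(forget_scores) + eps
    mu_u, sd_u = fmean(unseen_scores), pstdev(unseen_scores) + eps
    return gaussian_kl(mu_f, sd_f, mu_u, sd_u)

# Toy attack scores (made up for this example, not results from the paper).
unseen = [0.10, 0.15, 0.05]
forget_before = [0.90, 0.95, 0.85]   # attack is confident these were training members
forget_after = [0.20, 0.10, 0.15]    # scores now resemble out-of-sample data

print(score_divergence(forget_before, unseen) > score_divergence(forget_after, unseen))
```

A smaller divergence means the attack model can no longer tell the forgotten samples apart from data the model never saw, which is the behavior the figure describes.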

Theorems & Definitions (1)

  • Definition 1