Evaluating of Machine Unlearning: Robustness Verification Without Prior Modifications

Heng Xu, Tianqing Zhu, Wanlei Zhou

TL;DR

This work proposes a novel robustness verification scheme, operating exclusively through model parameters, that avoids the need for any sample-level modifications prior to model training while supporting verification on a much larger set and maintaining robustness.

Abstract

Machine unlearning, a process enabling pre-trained models to remove the influence of specific training samples, has attracted significant attention in recent years. While extensive research has focused on developing efficient unlearning strategies, the critical aspect of unlearning verification has been largely overlooked. Existing verification methods mainly rely on machine learning attack techniques, such as membership inference attacks (MIAs) or backdoor attacks. However, these methods were not formally designed for verification, so they exhibit limited robustness and support only a small, predefined subset of samples. Moreover, their dependence on sample-level modifications prepared in advance restricts their applicability in Machine Learning as a Service (MLaaS) environments. To address these limitations, we propose a novel robustness verification scheme that requires no prior modifications and supports verification on a much larger set. Our scheme employs an optimization-based method to recover the actual training samples from the model. By comparing the samples recovered before and after unlearning, MLaaS users can verify the unlearning process. This verification scheme, operating exclusively through model parameters, avoids the need for any sample-level modifications prior to model training while supporting verification on a much larger set and maintaining robustness. The effectiveness of our proposed approach is demonstrated through theoretical analysis and experiments involving diverse models on various datasets in different scenarios.
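
The comparison step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the optimization-based recovery of samples from model parameters is omitted, and the recovered images are assumed to be already available as arrays. The function names `global_ssim` and `verify_unlearning` and the threshold `min_drop` are hypothetical, and the similarity used here is a simplified single-window SSIM rather than the standard locally windowed version.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed over the whole image as one window.

    Standard SSIM averages this quantity over local windows; the
    single-window form is used here only to keep the sketch short.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    # Stabilising constants from the usual SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def verify_unlearning(target, recovered_before, recovered_after, min_drop=0.3):
    """Compare recoveries of one sample before and after unlearning.

    Returns (ssim_before, ssim_after, verified). `min_drop` is a
    hypothetical threshold, not a value taken from the paper.
    """
    s_before = global_ssim(target, recovered_before)
    s_after = global_ssim(target, recovered_after)
    return s_before, s_after, (s_before - s_after) >= min_drop


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.random((28, 28))  # stand-in for a sample to be unlearned
    # Assumed: before unlearning, recovery is close to the original sample.
    before = target + 0.05 * rng.standard_normal((28, 28))
    # Assumed: after unlearning, recovery no longer resembles the sample.
    after = rng.random((28, 28))
    print(verify_unlearning(target, before, after))
```

Under these assumptions, a sample that was unlearned should show a large drop in similarity between the two recoveries, while a remaining sample should score similarly before and after.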

Paper Structure

This paper contains 29 sections, 21 equations, 14 figures, 2 algorithms.

Figures (14)

  • Figure 1: Existing verification scheme process.
  • Figure 2: Our verification scheme process.
  • Figure 3: Verification results for unlearning samples under sample-level unlearning requests across different schemes. The metric on the Y-axis depends on the scheme on the X-axis: INA for the membership inference attack-based scheme (MIAs), ASR for the backdoor-based scheme (Backdoor), accuracy for the accuracy-based scheme (Accuracy), and SSIM for our proposed scheme (Our). Only the backdoor-based scheme and our scheme show significant changes for the unlearning samples.
  • Figure 4: Verification results for remaining samples under sample-level unlearning requests across different schemes. As in Figure 3, the metric on the Y-axis depends on the scheme on the X-axis: INA for the MIAs-based scheme, ASR for the backdoor-based scheme, accuracy for the accuracy-based scheme, and SSIM for our proposed scheme. None of the schemes shows significant changes for the remaining samples.
  • Figure 5: Original and recovered samples under the sample-level unlearning requests. The original-samples panel shows two sets: original unlearning samples on the left and remaining samples on the right. The before and after panels present the recovered samples before and after the unlearning process, respectively.
  • ...and 9 more figures

Theorems & Definitions (1)

  • Definition 1: Machine Unlearning [DBLP:journals/csur/XuZZZY24]