TAPE: Tailored Posterior Difference for Auditing of Machine Unlearning

Weiqi Wang, Zhiyi Tian, An Liu, Shui Yu

TL;DR

This work tackles the problem of auditing machine unlearning without requiring access to the original training process. It introduces TAPE, a pipeline that uses unlearned posterior differences to assess how much information about the erased data has been removed, by first constructing shadow models via influence estimation and then training an autoencoder-based Reconstructor. Two augmentation strategies, unlearned data perturbation (UDP) and unlearned influence-based division (UID), enable robust reconstruction for multi-sample requests. Extensive experiments across four datasets and multiple unlearning benchmarks demonstrate significant efficiency gains (up to $75\times$) and effective auditing of genuine samples for both exact and approximate unlearning, highlighting practical utility for right-to-be-forgotten scenarios in MLaaS. The approach preserves the original model's utility, does not require retraining or backdooring, and provides a scalable framework for unlearning auditing in real-world deployments.

Abstract

With the increasing prevalence of Web-based platforms handling vast amounts of user data, machine unlearning has emerged as a crucial mechanism to uphold users' right to be forgotten, enabling individuals to request the removal of their specified data from trained models. However, the auditing of machine unlearning processes remains significantly underexplored. Although some existing methods offer unlearning auditing by leveraging backdoors, these backdoor-based approaches are inefficient and impractical, as they necessitate involvement in the initial model training process to embed the backdoors. In this paper, we propose a TAilored Posterior diffErence (TAPE) method to provide unlearning auditing independently of original model training. We observe that the process of machine unlearning inherently introduces changes in the model, and these changes contain information related to the erased data. TAPE leverages these model differences to assess how much information has been removed through the unlearning operation. First, TAPE mimics the unlearned posterior differences by quickly building unlearned shadow models based on first-order influence estimation. Second, we train a Reconstructor model to extract and evaluate the private information in the unlearned posterior differences to audit unlearning. Existing privacy reconstruction methods based on posterior differences are only feasible for model updates of a single sample. To make the reconstruction effective for multi-sample unlearning requests, we propose two strategies, unlearned data perturbation and unlearned influence-based division, to augment the posterior difference. Extensive experimental results indicate the significant superiority of TAPE over state-of-the-art unlearning verification methods, with at least a 4.5$\times$ efficiency speedup and support for auditing in broader unlearning scenarios.
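
To make the notion of an unlearned posterior difference concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear softmax classifier and uses a single gradient-ascent step on the erased sample's loss as a stand-in for first-order influence estimation; the function names (`first_order_unlearn`, `posterior_difference`), the step size, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def grad_cross_entropy(W, x, y_onehot):
    # Gradient of the mean cross-entropy loss w.r.t. W for a linear softmax model.
    p = softmax(x @ W)                        # shape: (n_samples, n_classes)
    return x.T @ (p - y_onehot) / x.shape[0]  # shape: (n_features, n_classes)

def first_order_unlearn(W, x_u, y_u, step=0.5):
    # Assumption: approximate unlearning by one gradient-ascent step on the
    # erased samples' loss (a simple first-order stand-in, not the paper's formula).
    return W + step * grad_cross_entropy(W, x_u, y_u)

def posterior_difference(W_before, W_after, x_probe):
    # Change in the model's output probabilities caused by the unlearning update.
    return softmax(x_probe @ W_after) - softmax(x_probe @ W_before)

rng = np.random.default_rng(0)
n_features, n_classes = 8, 3
W = rng.normal(size=(n_features, n_classes))  # stand-in for a trained model
x_u = rng.normal(size=(1, n_features))        # the sample to be erased
y_u = np.eye(n_classes)[[1]]                  # its one-hot label (class 1)

W_shadow = first_order_unlearn(W, x_u, y_u)   # unlearned shadow model
delta = posterior_difference(W, W_shadow, x_u)
print(delta)
```

In this toy setting the entries of `delta` sum to zero and the entry for the erased sample's true class is negative, which is the kind of signal a Reconstructor could be trained on. How TAPE actually builds shadow models and consumes the difference is specified in the paper itself.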

Paper Structure

This paper contains 24 sections, 8 equations, 8 figures, 3 tables, 2 algorithms.

Figures (8)

  • Figure 1: (a) The backdoor-based verification and (b) The motivation of auditing unlearning effectiveness based on the posterior difference. The scheme (b) only involves the unlearning process rather than the initial model training.
  • Figure 2: Approximate unlearning process on genuine unlearned data $D_u$ and backdoored data $D_b$ on MNIST. During unlearning, the backdoor accuracy drops to $0\%$ at the blue vertical line. Meanwhile, the model accuracy on genuine unlearned data $D_u$ and test data is still around $80\%$.
  • Figure 3: The main process of the TAPE method. (a) The first part quickly builds the unlearned shadow models through first-order influence estimation based on the user's local dataset $D_{local}$ to mimic the unlearning posterior difference $\delta$. (b) Two posterior difference augmentation strategies are proposed to make the reconstruction suitable for multi-sample unlearning.
  • Figure 4: Auditing for different unlearning methods. TAPE consistently achieves significant efficiency improvement and a better unlearning auditing effect for a single sample (SS) than for multiple samples (MS).
  • Figure 5: Evaluation of the impact of different $\mathit{ESS}$ values. Here, we evaluate the unlearning verification of genuine samples (GS) rather than backdoored samples for MIB.
  • ...and 3 more figures