A hybrid framework for effective and efficient machine unlearning

Mingxin Li, Yizhen Yu, Ning Wang, Zhigang Wang, Xiaodong Wang, Haipeng Qu, Jia Xu, Shen Su, Zhichao Yin

TL;DR

The paper addresses the privacy challenge of removing a revoked sample's influence from trained models without the prohibitive cost of retraining from scratch. It introduces a hybrid machine unlearning framework that combines exact MU (SISA-style) and approximate MU (DPUS) by estimating the retraining workload and dynamically selecting between partial retraining and direct parameter updates to reduce overhead while preserving accuracy. An optimized variant further mitigates accuracy loss when handling multiple revocation requests by subtracting contributions and fine-tuning a subset of slices to convergence. Experiments on four real datasets show unlearning efficiency gains of $1.5\times$ to $8\times$ with comparable accuracy, and membership inference evaluations corroborate the effectiveness of the unlearning.

Abstract

Machine unlearning (MU) has recently been proposed to remove the imprints of revoked samples from already-trained model parameters, addressing users' privacy concerns. As alternatives to the expensive approach of retraining from scratch, there are two lines of research, exact MU and approximate MU, which make different trade-offs between accuracy and efficiency. In this paper, we present a novel hybrid strategy built on both to balance the two goals. It performs the unlearning operation at an acceptable computation cost while preserving as much accuracy as possible. Specifically, it selects an appropriate unlearning technique by estimating the retraining workload caused by revocations. If the workload is lightweight, it performs retraining to derive model parameters consistent with those obtained by retraining from scratch. Otherwise, it outputs the unlearned model by directly modifying the current parameters, for better efficiency. To improve accuracy in the latter case, we propose an optimized version that amends the output model with a lightweight runtime penalty. We also study the boundary between the two approaches in our framework so that the selection can be made adaptively. Extensive experiments on real datasets validate that our proposals can improve unlearning efficiency by 1.5$\times$ to 8$\times$ while achieving comparable accuracy.
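The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the workload estimate (the fraction of a shard's slices that would have to be replayed), the `Revocation` type, the function names, and the `threshold` value are all assumptions introduced here for illustration. The paper studies the actual boundary between the two approaches itself.

```python
from dataclasses import dataclass


@dataclass
class Revocation:
    """Hypothetical record of a revocation request (not from the paper)."""
    shard: int        # shard that holds the revoked sample
    slice_index: int  # 0-based slice in which the sample was first used


def estimate_retraining_workload(slice_index: int, num_slices: int) -> float:
    """Assumed workload proxy: fraction of slices that must be replayed.

    In SISA-style training, a checkpoint saved before each slice lets
    retraining resume from the slice that contains the revoked sample,
    so later slices imply less work.
    """
    if not 0 <= slice_index < num_slices:
        raise ValueError("slice_index must lie in [0, num_slices)")
    return (num_slices - slice_index) / num_slices


def choose_strategy(rev: Revocation, num_slices: int, threshold: float = 0.5) -> str:
    """Pick partial retraining for light workloads, direct update otherwise.

    The threshold is a placeholder value, not one reported in the paper.
    """
    workload = estimate_retraining_workload(rev.slice_index, num_slices)
    if workload <= threshold:
        return "partial_retraining"       # exact MU: replay the affected slices
    return "direct_parameter_update"      # approximate MU: edit the parameters


if __name__ == "__main__":
    late = Revocation(shard=0, slice_index=8)
    early = Revocation(shard=0, slice_index=1)
    print(choose_strategy(late, num_slices=10))
    print(choose_strategy(early, num_slices=10))
```

Under these assumptions, a sample that entered training in a late slice triggers partial retraining, while one that entered early falls back to the direct parameter update, since replaying nearly the whole shard would cost about as much as retraining it.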

Paper Structure

This paper contains 12 sections, 1 equation, 5 figures, 3 algorithms.

Figures (5)

  • Figure 1: An example of SISA.
  • Figure 2: An example of hybrid strategy.
  • Figure 3: An example of optimized hybrid strategy.
  • Figure 4: The comparison of accuracy among direct parameters update strategy (DPUS), hybrid strategy (HS), partial retraining (SISA), and optimized hybrid strategy (OHS) across the four datasets.
  • Figure 5: The comparison of unlearning time among direct parameters update strategy (DPUS), partial retraining (SISA), hybrid strategy (HS), and optimized hybrid strategy (OHS) across the four datasets.