Towards Natural Machine Unlearning

Zhengbao He, Tao Li, Xinwen Cheng, Zhehao Huang, Xiaolin Huang

TL;DR

This work tackles the unnaturalness and inefficiency of relabeling-based machine unlearning by proposing NatMU, a Mixup-inspired input-level approach that injects correct information from the remaining data into forgetting samples to form natural unlearning hybrids. By selecting diverse remaining instances, applying gradual Mixup masks, and labeling hybrids with the injected information, NatMU reinforces the remaining-data associations while suppressing forgotten content, yielding a smaller $\mathrm{KL}_{avg}$ and reduced privacy leakage. Empirical results across CIFAR-10/100, CIFAR-20, and TinyImageNet-200 show NatMU achieves performance close to retraining with significantly lower cost and robust behavior under class-wise and sample-wise forgetting, including difficult distribution shifts. Overall, NatMU demonstrates strong practical potential for natural, efficient machine unlearning with broad applicability and resilience to hyperparameters.

Abstract

Machine unlearning (MU) aims to eliminate information that has been learned from specific training data, namely forgetting data, from a pre-trained model. Most existing MU methods modify the forgetting data with incorrect labels and then fine-tune the model. While learning such incorrect information can indeed remove knowledge, the process is quite unnatural: it undesirably reinforces the incorrect information and leads to over-forgetting. Towards more natural machine unlearning, we inject correct information from the remaining data into the forgetting samples when changing their labels. When these adjusted samples are paired with their labels, the model tends to use the injected correct information and naturally suppresses the information meant to be forgotten. Although straightforward, this first step towards natural machine unlearning can significantly outperform current state-of-the-art approaches. In particular, our method substantially reduces over-forgetting and is strongly robust to hyperparameters, making it a promising candidate for practical machine unlearning.
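As a rough illustration of the idea of injecting remaining-data information into a forgetting sample, the following minimal sketch blends one forgetting image with one remaining image under a pixel-wise mask and assigns the remaining sample's label to the result. It is not the authors' implementation: the function names `gradual_mask` and `make_hybrid`, the linear left-to-right ramp, and the use of grayscale images stored as nested lists are all assumptions made for this example.

```python
def gradual_mask(height, width):
    """Hypothetical gradual mask (an assumption, not the paper's mask).

    The weight on the forgetting sample ramps linearly from 1.0 at the
    left edge to 0.0 at the right edge.
    """
    if width == 1:
        row = [1.0]
    else:
        row = [1.0 - j / (width - 1) for j in range(width)]
    return [list(row) for _ in range(height)]


def make_hybrid(x_forget, x_remain, y_remain, mask):
    """Blend a forgetting image with a remaining image, pixel by pixel.

    Each pixel is mask * x_forget + (1 - mask) * x_remain. The hybrid is
    labeled with the remaining sample's label, i.e. the label of the
    injected information.
    """
    hybrid = [
        [m * f + (1.0 - m) * r for f, r, m in zip(f_row, r_row, m_row)]
        for f_row, r_row, m_row in zip(x_forget, x_remain, mask)
    ]
    return hybrid, y_remain


# Toy usage: a 3x4 all-ones "forgetting" image and an all-zeros "remaining" image.
x_f = [[1.0] * 4 for _ in range(3)]
x_r = [[0.0] * 4 for _ in range(3)]
hybrid, label = make_hybrid(x_f, x_r, 7, gradual_mask(3, 4))
```

In this toy case the left column of `hybrid` comes entirely from the forgetting image and the right column entirely from the remaining image, while the label is the remaining sample's label. Fine-tuning on such pairs is what the abstract describes as pairing the adjusted samples with their labels.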

Paper Structure (33 sections, 7 equations, 8 figures, 10 tables, 1 algorithm)

Figures (8)

  • Figure 1: Accuracy comparison of different methods on forgetting samples. The experiments are conducted on CIFAR-100 using ResNet-18 under the 1% sample-wise unlearning setting. The dashed line is the natural forgetting accuracy of the retrained model; a smaller gap to it indicates better MU. The forgetting accuracy of other methods keeps decreasing after crossing the dashed line, while ours converges to that of the retrained model. Hence, to obtain good MU performance, other methods may need to carefully stop training at an intermediate point, denoted as $\star$.
  • Figure 2: Entropy distribution of different unlearned models' outputs on forgetting and test samples, estimated with kernel density estimation. The output distribution of NatMU is similar to that of Retrain on both forgetting samples and test samples. In contrast, other methods significantly change the entropy distribution of forgetting samples, which makes forgetting samples separable from test samples and poses a serious risk of privacy leakage. The experiments are conducted on CIFAR-100 using ResNet-18 under the 10% random-sample-wise unlearning setting. The number of unlearning epochs is set to 5 and the hyperparameters are selected following the setup in the random-sample-wise unlearning section.
  • Figure 3: (a) Visualization of our unlearning instances and their attention maps before and after unlearning. The attention maps are calculated with LRP. After unlearning, the attention shifts to the remaining information. (b) Relationship between the gap in forgetting accuracy and the KL divergence on different datasets. A smaller KL divergence indicates a more natural MU process. NatMU's KL divergence is much smaller than that of other methods, i.e., more natural, and thus results in a smaller accuracy gap.
  • Figure 4: Accuracy curves on partial unlearning instances $\{ (\boldsymbol{x}^f \circ \boldsymbol{m}, y^r)\}$ and forgetting accuracy of different MU models trained with/without correct information. The experiment is conducted on CIFAR-100 with a forgetting ratio of 1% using ResNet-18. PFA: accuracy of classifying partial forgetting samples $\boldsymbol{x}^f\circ \boldsymbol{m}$ as the random label $y^r$. FA: forgetting accuracy. CI: correct information. ReT: the retrained model.
  • Figure 5: Training time of different unlearning methods. NatMU achieves comparable unlearning efficiency to other baselines, and all unlearning methods require significantly less training time than retraining from scratch.
  • ...and 3 more figures
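
The captions above compare models through the entropy of their output distributions (Figure 2) and through a KL divergence (Figure 3b). The sketch below shows, under stated assumptions, how these two quantities can be computed for a single prediction from raw logits. The helper names `softmax`, `entropy`, and `kl_divergence` are hypothetical, and this excerpt does not define exactly which pair of distributions enters the paper's KL measure, so the direction and arguments used here are only illustrative.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for numerical stability."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


# Toy usage: a confident prediction versus a less confident one over 4 classes.
p_confident = softmax([5.0, 0.0, 0.0, 0.0])
p_flat = softmax([1.0, 0.5, 0.0, 0.0])
h_confident = entropy(p_confident)
h_flat = entropy(p_flat)
gap = kl_divergence(p_confident, p_flat)
```

Collecting `entropy` values over all forgetting samples and all test samples, then plotting their densities, would give a comparison in the spirit of Figure 2. If the two densities differ sharply, an observer can tell forgetting samples apart from unseen ones.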