Remaining-data-free Machine Unlearning by Suppressing Sample Contribution

Xinwen Cheng, Zhehao Huang, Wenxin Zhou, Zhengbao He, Ruikai Yang, Yingwen Wu, Xiaolin Huang

TL;DR

This work tackles machine unlearning (MU) under the right-to-be-forgotten (RTBF) setting by proposing MU-Mis, a method that suppresses the forgetting data's contribution without access to the remaining data. The key idea is that a sample's contribution to the learning process is reflected in the pre-trained model's input sensitivity, particularly in the gap between the input-gradient norms of the target-class logit and an irrelevant-class logit; MU-Mis minimizes this gap on forgetting samples. Experiments across multiple datasets and forgetting tasks show that MU-Mis performs close to retrained models, offers strong privacy as measured by membership inference attacks (MIA), and exceeds existing remaining-data-free methods in both utility and resilience, while greatly improving efficiency on large-scale models. The approach enables unlearning in deep networks without retraining or surrogate data and remains robust under sequential unlearning.

Abstract

Machine unlearning (MU) aims to forget data from a well-trained model, which is practically important due to the "right to be forgotten". The unlearned model should approach the retrained model, where the forgetting data are not involved in the training process and hence do not contribute to the retrained model. Considering the forgetting data's absence during retraining, we think unlearning should withdraw their contribution from the pre-trained model. The challenge is how to quantify and detach a sample's contribution to the dynamic learning process using only the pre-trained model, given that tracing the learning process is impractical. We first theoretically discover that a sample's contribution during the process is reflected in the learned model's sensitivity to it. We then design a novel method, namely MU-Mis (Machine Unlearning by Minimizing input sensitivity), to suppress the contribution of the forgetting data. Experimental results demonstrate that MU-Mis can unlearn effectively and efficiently without utilizing the remaining data. It is the first time that a remaining-data-free method can outperform state-of-the-art (SoTA) unlearning methods that utilize the remaining data.
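
To make the notion of input sensitivity concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy two-layer tanh network (hypothetical parameters `W1`, `b1`, `W2`, `b2`) whose input Jacobian can be written by hand, and it evaluates the gap between the input-gradient norm of the target-class logit and that of a caller-chosen irrelevant-class logit. The function names and the choice of the irrelevant class are illustrative assumptions; the source does not specify them here.

```python
import math

def mlp_logits_and_jacobian(x, W1, b1, W2, b2):
    """Return logits f(x) and the input Jacobian df/dx of a toy tanh MLP.

    Assumed model (for illustration only): f(x) = W2 * tanh(W1 * x + b1) + b2.
    """
    # Hidden activations h = tanh(W1 x + b1)
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    logits = [sum(w * hj for w, hj in zip(row, h)) + b
              for row, b in zip(W2, b2)]
    # Chain rule: d f_k / d x_i = sum_j W2[k][j] * (1 - h_j^2) * W1[j][i]
    jac = [[sum(W2[k][j] * (1.0 - h[j] ** 2) * W1[j][i] for j in range(len(h)))
            for i in range(len(x))]
           for k in range(len(W2))]
    return logits, jac

def l2_norm(v):
    """Euclidean norm; for a vector input this equals the Frobenius norm."""
    return math.sqrt(sum(a * a for a in v))

def total_sensitivity(x, params):
    """Frobenius norm of the full input Jacobian, i.e. ||grad_x f||_F."""
    _, jac = mlp_logits_and_jacobian(x, *params)
    return math.sqrt(sum(a * a for row in jac for a in row))

def sensitivity_gap(x, target, irrelevant, params):
    """||grad_x f_c|| - ||grad_x f_c'|| for one sample.

    `target` is the label index c; `irrelevant` is a non-target index c'
    picked by the caller (an assumption of this sketch).
    """
    _, jac = mlp_logits_and_jacobian(x, *params)
    return l2_norm(jac[target]) - l2_norm(jac[irrelevant])
```

An unlearning procedure in this spirit would reduce the gap on forgetting samples by updating the weights; the sketch only evaluates the quantity. For real networks the input gradients would normally come from an automatic differentiation framework rather than a hand-written Jacobian.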

Paper Structure

This paper contains 36 sections, 12 equations, 11 figures, 13 tables, 1 algorithm.

Figures (11)

  • Figure 1: Input sensitivity $\Vert \nabla_x f\Vert_F$ of training data before and after training. Left: in the randomly initialized model $w_0$. Right: in the well-trained model $w_p$. After training, the model exhibits significantly increased sensitivity to the training data. This increase reflects the training data's contribution during training.
  • Figure 2: Input sensitivity $\Vert \nabla_x f_c\Vert_F$ and $\Vert \nabla_x f_{c^\prime}\Vert_F$ of training data before and after training. Left: in the randomly initialized model $w_0$. Right: in the well-trained model $w_p$. After training, the model has learned the distribution of the training data, and the sensitivity magnitude gap between the target-class logit and the irrelevant-class logit increases, offering a clearer indication of the sample's contribution.
  • Figure 3: Ratios of rise and fall in the input sensitivity difference $\Delta$ on the forgetting data under different unlearning settings. From left to right, $\Delta$ is the sample-wise difference between the retrained and pre-trained model on $\Vert\nabla_{\boldsymbol{x}} f_c\Vert_F$, $\Vert\nabla_{\boldsymbol{x}} f_{c^\prime}\Vert_F$, and $\Vert\nabla_{\boldsymbol{x}} f_c\Vert_F-\Vert\nabla_{\boldsymbol{x}} f_{c^\prime}\Vert_F$. A sample's contribution to input sensitivity includes promoting $\Vert\nabla_{\boldsymbol{x}} f_c\Vert_F$ and suppressing $\Vert\nabla_{\boldsymbol{x}} f_{c^\prime}\Vert_F$, thereby enlarging the magnitude gap $\Vert\nabla_{\boldsymbol{x}} f_c\Vert_F-\Vert\nabla_{\boldsymbol{x}} f_{c^\prime}\Vert_F$.
  • Figure 4: Accuracy and optimization objective during fullclass-CIFAR100-rocket unlearning with different learning rates on ResNet-18. FA decreases gradually, while RA and TA first decrease slightly and then rise again as $\Vert\nabla_x f_{c^\prime}(x, w)\Vert_F$ recovers. The end point of each curve corresponds to the time at which $\Vert\nabla_x f_{c^\prime}(x, w)\Vert_F$ exceeds $90\%$ of its initial value.
  • Figure 5: Disparities in accuracy-related metrics between the unlearned model and the retrained model for full-class and sub-class sequential unlearning. Left: iteratively unlearning 5 distinct full classes ('Apples', 'Fish', 'Baby', 'Bear' and 'Beaver') of CIFAR-100. Right: iteratively unlearning 5 sub-classes ('Orchid', 'Poppy', 'Rose', 'Sunflower', 'Tulip') of the same super-class 'Flower' of CIFAR-20. The results show that BT and SalUn suffer from performance recovery on the forgotten classes, FT fails to unlearn effectively in the sub-class task, and SSD risks utility breakdown when parameter magnitudes are continuously scaled, whereas MU-Mis exhibits good unlearning utility and resilience across both tasks.
  • ...and 6 more figures
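
The Figure 4 caption describes curves that end once the irrelevant-class sensitivity climbs back above 90% of its initial value. A minimal sketch of such a check follows. It assumes the value is logged once per step and that it first dips below the threshold before recovering, as the caption's description of a recovery suggests; the function name `recovery_stop_step` and the dip-then-recover reading are assumptions of this sketch, not details confirmed by the source.

```python
def recovery_stop_step(sensitivity_history, ratio=0.9):
    """Return the first step at which the tracked value, after having dropped
    below ratio * initial value, rises above that threshold again.

    Returns None if the history is empty or no such recovery occurs.
    """
    if not sensitivity_history:
        return None
    threshold = ratio * sensitivity_history[0]
    dipped = False
    for step, value in enumerate(sensitivity_history):
        if value < threshold:
            # The value has fallen below the threshold at least once.
            dipped = True
        elif dipped and value > threshold:
            # First step where it recovers past the threshold.
            return step
    return None
```

Requiring a dip first avoids stopping at step 0, where the value trivially exceeds 90% of itself.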