
Towards Aligned Data Removal via Twin Machine Unlearning

Haoxuan Ji, Zheng Lin, Yuyao Sun, Gao Fei, Yuhang Wang, Haichang Gao, Zhenxing Niu

TL;DR

This paper tackles data forgetting under privacy regulations by reframing unlearning as alignment between the unlearned model and the gold model. It introduces Twin Machine Unlearning (TMU), which constructs a Twin Unlearning Problem to derive a generalization-label predictor using a binary classifier trained on a twin dataset, enabling selective forgetting of hard samples while preserving easy ones. Three discriminative features (Distance Feature $DF$, Adversarial-attacking Feature $AF$, and Curriculum-learning-loss Feature $CF$) are integrated to identify hard forgetting samples, and a noise-perturbed fine-tuning objective is used to resist Membership Inference Attacks while maintaining generalization. Empirical results on CIFAR-10/100 and VGGFaces2 across multiple architectures show improved alignment (reduced $\text{Delta}$ and Activation Distance) and maintained accuracy, demonstrating practical potential for privacy-preserving model updates.

Abstract

Modern privacy regulations have spurred the evolution of machine unlearning, a technique that enables the removal of data from an already trained ML model without requiring retraining from scratch. Previous unlearning methods tend to induce the model to achieve the lowest classification accuracy on the removal data. Nonetheless, the authentic objective of machine unlearning is to align the unlearned model with the gold model, i.e., achieving the same classification accuracy as the gold model. For this purpose, we present a Twin Machine Unlearning (TMU) approach, where a twin unlearning problem is defined corresponding to the original unlearning problem. As a result, the generalization-label predictor trained on the twin problem can be transferred to the original problem, facilitating aligned data removal. Comprehensive empirical experiments illustrate that our approach significantly enhances the alignment between the unlearned model and the gold model. Meanwhile, our method allows data removal without compromising the model accuracy.
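
To make the distinction between "lowest accuracy" and "aligned accuracy" concrete, the following is a minimal sketch with made-up predictions. It assumes the gold model is a model retrained without the removal data (the term is not defined above), and it measures alignment as the absolute accuracy gap on the removal data. The function names `accuracy` and `alignment_gap` are hypothetical, and this gap is an illustrative stand-in, not necessarily the paper's own metric.

```python
import numpy as np

def accuracy(preds, labels):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(preds) == np.asarray(labels)))

def alignment_gap(unlearned_preds, gold_preds, labels):
    """Absolute accuracy difference between the unlearned and gold models.

    Illustrative measure only (an assumption): 0.0 means the two models
    reach the same accuracy on the removal data.
    """
    return abs(accuracy(unlearned_preds, labels) - accuracy(gold_preds, labels))

# Toy removal set with 8 samples (made-up values).
labels = np.array([0, 1, 2, 1, 0, 2, 1, 0])

# The gold model never saw these samples but still generalizes to 5 of 8.
gold_preds = np.array([0, 1, 2, 1, 0, 0, 2, 1])

# A method that drives accuracy on the removal data as low as possible.
over_forgotten = np.array([1, 2, 0, 0, 1, 0, 2, 1])

# A method that matches the gold model's accuracy instead.
aligned = np.array([0, 1, 2, 1, 0, 1, 0, 2])

print(alignment_gap(over_forgotten, gold_preds, labels))
print(alignment_gap(aligned, gold_preds, labels))
```

In this toy setting, the over-forgetting predictions score lower on the removal data than the gold model does, so the gap is large, while the aligned predictions leave no gap.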

Paper Structure (23 sections, 6 equations, 3 figures, 13 tables)

Figures (3)

  • Figure 1: Construction of the Twin Model and corresponding Twin Unlearning Problem.
  • Figure 2: The workflow of our approach. We first construct the twin model $M_t$ by fine-tuning $M_o$ on $D_{test}$. We then extract discriminative features and train a binary classifier on $D_{test}$. Next, we transfer the binary classifier to the original problem to predict the generalization-labels on $D_f$. Finally, we reduce classification accuracy on the hard forgetting samples $D_f^h$.
  • Figure 3: The quality of alignment varies with the increase in the size of $D_f$.
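
The workflow in the Figure 2 caption can be sketched roughly as follows. This is a toy illustration under several assumptions, not the authors' implementation. The three feature columns stand in for $DF$, $AF$, and $CF$ and are drawn at random, not extracted from $M_t$ or $M_o$. Generalization-labels on $D_{test}$ are assumed to be known, with 1 meaning "hard". A small NumPy logistic regression stands in for the binary classifier. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_features(n_easy, n_hard):
    """Synthetic stand-in for (DF, AF, CF) features; label 1 = hard sample."""
    easy = rng.normal(loc=1.0, scale=0.3, size=(n_easy, 3))
    hard = rng.normal(loc=-1.0, scale=0.3, size=(n_hard, 3))
    X = np.vstack([easy, hard])
    y = np.concatenate([np.zeros(n_easy), np.ones(n_hard)])
    return X, y

def train_logistic(X, y, lr=0.5, steps=2000):
    """Fit logistic regression by plain gradient descent (bias as last weight)."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(X1.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X1 @ w)))
        w -= lr * X1.T @ (p - y) / len(y)
    return w

def predict_hard(w, X):
    """Return a boolean mask: True where a sample is predicted to be hard."""
    X1 = np.hstack([X, np.ones((len(X), 1))])
    return (X1 @ w) > 0

# Twin problem: features and generalization-labels on D_test (assumed known).
X_test, y_test = fake_features(n_easy=200, n_hard=200)
w = train_logistic(X_test, y_test)

# Original problem: transfer the classifier to features computed on D_f.
X_f, y_f = fake_features(n_easy=50, n_hard=50)  # y_f is used only for checking
hard_mask = predict_hard(w, X_f)

hard_idx = np.flatnonzero(hard_mask)   # candidate D_f^h: accuracy is reduced here
easy_idx = np.flatnonzero(~hard_mask)  # easy samples: left as they are

print(len(hard_idx), len(easy_idx))
```

Only the samples in `hard_idx` would then be passed to the forgetting step. The sketch stops before that step because the caption does not specify how accuracy on $D_f^h$ is reduced.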