Towards Aligned Data Removal via Twin Machine Unlearning
Haoxuan Ji, Zheng Lin, Yuyao Sun, Gao Fei, Yuhang Wang, Haichang Gao, Zhenxing Niu
TL;DR
This paper tackles data forgetting under privacy regulations by reframing unlearning as alignment between the unlearned model and the gold model. It introduces Twin Machine Unlearning (TMU), which constructs a Twin Unlearning Problem to derive a generalization-label predictor using a binary classifier trained on a twin dataset, enabling selective forgetting of hard samples while preserving easy ones. Three discriminative features (Distance Feature $DF$, Adversarial-attacking Feature $AF$, and Curriculum-learning-loss Feature $CF$) are integrated to identify hard forgetting samples, and a noise-perturbed fine-tuning objective is used to resist Membership Inference Attacks while maintaining generalization. Empirical results on CIFAR-10/100 and VGGFaces2 across multiple architectures show improved alignment (reduced $\text{Delta}$ and Activation Distance) and maintained accuracy, demonstrating practical potential for privacy-preserving model updates.
Abstract
Modern privacy regulations have spurred the evolution of machine unlearning, a technique that enables the removal of data from an already trained ML model without requiring retraining from scratch. Previous unlearning methods tend to induce the model to achieve the lowest possible classification accuracy on the removal data. However, the true objective of machine unlearning is to align the unlearned model with the gold model, i.e., to achieve the same classification accuracy as the gold model. For this purpose, we present a Twin Machine Unlearning (TMU) approach, in which a twin unlearning problem is defined that corresponds to the original unlearning problem. As a result, the generalization-label predictor trained on the twin problem can be transferred to the original problem, facilitating aligned data removal. Comprehensive experiments show that our approach significantly enhances the alignment between the unlearned model and the gold model. Meanwhile, our method allows data removal without compromising model accuracy.
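The contrast between minimizing accuracy on the removal data and matching the gold model can be illustrated with a toy calculation. The sketch below is not the paper's evaluation code: the helper names `accuracy` and `alignment_gap`, the toy labels, and the choice of an absolute accuracy difference as the gap are all assumptions made for illustration, and the gap may not coincide with the paper's exact $\text{Delta}$ metric.

```python
from typing import Sequence


def accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that equal the ground-truth labels."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def alignment_gap(
    unlearned_preds: Sequence[int],
    gold_preds: Sequence[int],
    labels: Sequence[int],
) -> float:
    """Hypothetical alignment measure: absolute difference between the
    unlearned model's and the gold model's accuracy on the removal data."""
    return abs(accuracy(unlearned_preds, labels) - accuracy(gold_preds, labels))


# Toy removal set (made-up labels and predictions, for illustration only).
labels = [0, 1, 2, 1, 0, 2, 1, 0]

# The gold model never saw these samples but still generalizes to some of them.
gold_preds = [0, 1, 2, 0, 0, 1, 1, 2]

# An unlearned model pushed toward the lowest accuracy on the removal data.
minimized_preds = [1, 0, 0, 0, 1, 0, 0, 1]

# An unlearned model whose accuracy on the removal data matches the gold model.
aligned_preds = [0, 1, 2, 1, 1, 0, 1, 2]

gap_minimized = alignment_gap(minimized_preds, gold_preds, labels)
gap_aligned = alignment_gap(aligned_preds, gold_preds, labels)
print(f"gold accuracy:        {accuracy(gold_preds, labels):.3f}")
print(f"gap (minimized):      {gap_minimized:.3f}")
print(f"gap (aligned):        {gap_aligned:.3f}")
```

In this toy setting the accuracy-minimizing model ends up far from the gold model, while the aligned model has zero gap even though it still classifies some removal samples correctly. Equal accuracy does not imply identical per-sample predictions, which is presumably why the summary above also reports Activation Distance.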
