Towards Aligned Data Removal via Twin Machine Unlearning

Haoxuan Ji; Zheng Lin; Yuyao Sun; Gao Fei; Yuhang Wang; Haichang Gao; Zhenxing Niu

TL;DR

This paper addresses data forgetting under privacy regulations by reframing unlearning as alignment between the unlearned model and the gold model. It introduces Twin Machine Unlearning (TMU), which constructs a Twin Unlearning Problem and trains a binary classifier on a twin dataset to serve as a generalization-label predictor, enabling selective forgetting of hard samples while preserving easy ones. Three discriminative features are integrated to identify hard forgetting samples: the Distance Feature $DF$, the Adversarial-attacking Feature $AF$, and the Curriculum-learning-loss Feature $CF$. A noise-perturbed fine-tuning objective is used to resist Membership Inference Attacks while maintaining generalization. Empirical results on CIFAR-10/100 and VGGFaces2 across multiple architectures show improved alignment (reduced $\Delta$ and Activation Distance) and maintained accuracy, demonstrating practical potential for privacy-preserving model updates.

Abstract

Modern privacy regulations have spurred the evolution of machine unlearning, a technique that enables the removal of data from an already trained ML model without requiring retraining from scratch. Previous unlearning methods tend to drive the model toward the lowest possible classification accuracy on the removal data. However, the true objective of machine unlearning is to align the unlearned model with the gold model, i.e., to achieve the same classification accuracy as the gold model. For this purpose, we present a Twin Machine Unlearning (TMU) approach, in which a twin unlearning problem is defined to correspond to the original unlearning problem. As a result, the generalization-label predictor trained on the twin problem can be transferred to the original problem, facilitating aligned data removal. Comprehensive experiments show that our approach significantly improves the alignment between the unlearned model and the gold model, while allowing data removal without compromising model accuracy.

Paper Structure (23 sections, 6 equations, 3 figures, 13 tables)

Introduction
Problem of Data Forgetting
Alignment in Data Forgetting
Our Approach
Twin Unlearning Problem
Discriminative Features
Distance Feature (DF)
Adversarial-attacking Feature (AF)
Curriculum-learning-loss Feature (CF)
Binary Classifier
Evaluation
Experimental setting
Implementation Details
Main Results
Ablation Study
...and 8 more sections

Figures (3)

Figure 1: Construction of the Twin Model and corresponding Twin Unlearning Problem.
Figure 2: The workflow of our approach. We first construct the twin model by fine-tuning $M_o$ on $D_{test}$ to produce $M_t$. We then extract discriminative features and train a binary classifier on $D_{test}$. Next, we transfer the binary classifier to the original problem to predict the generalization labels on $D_f$. Finally, we reduce classification accuracy on $D_f^h$.
Figure 3: The quality of alignment varies with the increase in the size of $D_f$.
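
The transfer step described in the Figure 2 caption can be illustrated with a minimal, hedged sketch. It is not the authors' implementation: the feature values are synthetic stand-ins for $(DF, AF, CF)$, the labeling rule is an arbitrary toy rule, the classifier is a plain logistic regression, and all function and variable names are hypothetical. It only shows the shape of the idea: fit a binary classifier on the twin problem, then apply it to the forgetting set $D_f$ to pick out a predicted hard subset (here assumed to be the samples assigned label 1).

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, w, b):
    # Linear score followed by the logistic link.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logistic(features, labels, lr=0.5, epochs=300):
    """Fit logistic regression with full-batch gradient descent."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = score(x, w, b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_labels(features, w, b, threshold=0.5):
    return [1 if score(x, w, b) >= threshold else 0 for x in features]


def toy_samples(n, rng):
    # ASSUMPTION: synthetic (DF, AF, CF) triples in [-1, 1] and an
    # arbitrary toy labeling rule; real features come from the models.
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    ys = [1 if sum(x) > 0 else 0 for x in xs]
    return xs, ys


rng = random.Random(0)
twin_x, twin_y = toy_samples(200, rng)   # twin problem: labels are known
forget_x, _ = toy_samples(50, rng)       # original problem: labels unknown

# Train on the twin problem, then transfer to the forgetting set.
w, b = train_logistic(twin_x, twin_y)
twin_pred = predict_labels(twin_x, w, b)
twin_acc = sum(p == y for p, y in zip(twin_pred, twin_y)) / len(twin_y)

forget_pred = predict_labels(forget_x, w, b)
hard_idx = [i for i, p in enumerate(forget_pred) if p == 1]  # predicted hard subset
easy_idx = [i for i, p in enumerate(forget_pred) if p == 0]  # left untouched
```

In this sketch only the samples indexed by `hard_idx` would be passed on to the accuracy-reducing step; how that step and the three features are actually computed is specified in the paper, not here.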
