Forget Vectors at Play: Universal Input Perturbations Driving Machine Unlearning in Image Classification

Changchang Sun, Ren Wang, Yihua Zhang, Jinghan Jia, Jiancheng Liu, Gaowen Liu, Yan Yan, Sijia Liu

TL;DR

This work tackles machine unlearning (MU) for image classification by introducing forget vectors: universal input perturbations that enable unlearning without altering model weights. By formulating a data-based MU objective that combines a forget loss with retain-data regularization, the approach achieves competitive unlearning accuracy (UA) and strong MIA-Efficacy relative to model-based methods, at some cost in utility (retain accuracy, RA, and testing accuracy, TA). A key innovation is compositional unlearning via forget vector arithmetic, in which class-wise forget vectors are combined to generate new forget vectors for unseen forgetting tasks. Experiments on CIFAR-10 and ImageNet-10 validate the method’s effectiveness and transferability, and saliency analyses support its interpretability; the method also offers practical advantages in storage and parameter efficiency. The work presents a promising, scalable alternative to retraining-based MU, with potential impact on privacy-compliant data removal and rapid model editing in vision systems.

Abstract

Machine unlearning (MU), which seeks to erase the influence of specific unwanted data from already-trained models, is becoming increasingly vital in model editing, particularly to comply with evolving data regulations like the "right to be forgotten". Conventional approaches are predominantly model-based, typically requiring retraining or fine-tuning the model's weights to meet unlearning requirements. In this work, we approach the MU problem from a novel input perturbation-based perspective, where the model weights remain intact throughout the unlearning process. We demonstrate the existence of a proactive input-based unlearning strategy, referred to as the forget vector, which can be generated as an input-agnostic data perturbation and remains as effective as model-based approximate unlearning approaches. We also explore forget vector arithmetic, whereby multiple class-specific forget vectors are combined through simple operations (e.g., linear combinations) to generate new forget vectors for unseen unlearning tasks, such as forgetting arbitrary subsets across classes. Extensive experiments validate the effectiveness and adaptability of the forget vector, showcasing its competitive performance relative to state-of-the-art model-based methods. Code is available at https://github.com/Changchangsun/Forget-Vector.
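
The two operations named in the abstract, applying one input-agnostic perturbation to every input and linearly combining class-wise forget vectors, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the clipping to [0, 1], the weights 0.6 and 0.4, and the random placeholder vectors are all assumptions made for illustration, and real forget vectors would be obtained by optimization rather than sampled.

```python
import numpy as np

def apply_forget_vector(images, delta):
    """Add one shared perturbation to every image; model weights are untouched."""
    # images: (N, C, H, W); delta: (C, H, W), broadcast over the batch axis.
    # Clipping to [0, 1] is an assumption about the valid pixel range.
    return np.clip(images + delta, 0.0, 1.0)

def combine_forget_vectors(vectors, weights):
    """Linear combination of class-wise forget vectors: sum_k w_k * delta_k."""
    stacked = np.stack(vectors)                               # (K, C, H, W)
    w = np.asarray(weights, dtype=stacked.dtype).reshape(-1, 1, 1, 1)
    return (w * stacked).sum(axis=0)                          # (C, H, W)

rng = np.random.default_rng(0)
batch = rng.random((4, 3, 32, 32))                  # stand-in for CIFAR-10-sized inputs
delta_a = 0.05 * rng.standard_normal((3, 32, 32))   # placeholder, not a learned vector
delta_b = 0.05 * rng.standard_normal((3, 32, 32))   # placeholder, not a learned vector

delta_new = combine_forget_vectors([delta_a, delta_b], [0.6, 0.4])  # hypothetical weights
perturbed = apply_forget_vector(batch, delta_new)
```

Because the perturbation is shared across inputs, only a single tensor of input shape needs to be stored per forgetting task, which is the storage advantage the TL;DR alludes to.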

Paper Structure

This paper contains 36 sections, 6 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: A schematic illustration comparing our proposed data-based MU method (termed the 'forget vector') against traditional model update-based unlearning methods. The forget vector achieves the unlearning objectives (i.e., forgetting 'dog' and remembering 'bird' in this example) by operating directly on input data without altering model parameters. A cross mark (✗) indicates that the forget data is successfully unlearned, while a check mark (✓) means that the retain data is correctly recognized or the forget data is not successfully unlearned. The "original model" refers to the model without unlearning applied, and "SCRUB" (Kurmanji et al., 2024) is an existing representative unlearning method that updates model weights.
  • Figure 2: The performance of class-wise forgetting on (ResNet-18, CIFAR-10) using the unlearning method Retrain vs. the (pre-unlearning) original model performance (Origin), evaluated on both benign evaluation sets (Benign) and perturbed sets, which include (1) Gaussian noise (GN) with a standard deviation of $0.08$ (termed GN1), (2) GN with a standard deviation of $0.2$ (termed GN2), (3) Elastic transformation (ET) with parameters (488, 170.8, 24.4) regarding intensity, smoothing, and offset (termed ET1), (4) ET with parameters (488, 19.52, 48.8) (termed ET2), and (5) adversarial perturbations from a 7-step PGD attack with strength $\epsilon = 8/255$. The unlearning performance metrics are reported as (a) TA (testing accuracy), (b) RA (retain accuracy), (c) UA (unlearning accuracy), and (d) MIA-Efficacy, as defined in the Preliminaries section. The average performance is reported over 10 independent trials, where each trial focuses on forgetting one specific class from CIFAR-10. Shaded regions indicate the performance variance.
  • Figure 3: The performance gap relative to Retrain for class-wise forget vector arithmetic (based on classes "automobile" and "bird") across different combination coefficients $w_1$ and $w_2$, when unlearning a randomly selected 10% of training points from these two classes of CIFAR-10. Each cell displays the gap (%) relative to Retrain at a specific weight combination, where a lower value indicates a closer performance to Retrain given a metric. A green star (★) denotes the selected weight combination scheme ($w_1$ and $w_2$) that achieves the smallest performance gap relative to Retrain, averaged over both UA Gap and RA Gap.
  • Figure 4: Gradient-based saliency maps visualized via Grad-CAM for different MU methods on forget images. The highlighted areas (marked in red) indicate the regions most influential to the model prediction. The red cross mark (✗) indicates that the corresponding method effectively unlearns the input forget image, while the check mark (✓) signifies the opposite.
  • Figure 5: Gradient-based saliency map visualization using Grad-CAM for different MU methods against retain images.
  • ...and 5 more figures
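
The selection rule described in the Figure 3 caption (choose the $(w_1, w_2)$ cell with the smallest gap to Retrain, averaged over UA Gap and RA Gap) can be sketched as a simple grid search. The helper name `select_weights` is hypothetical, and the gap values below are made-up placeholders for illustration only, not numbers from the paper.

```python
import numpy as np

def select_weights(w1_values, w2_values, ua_gap, ra_gap):
    """Return (w1, w2, avg_gap) for the cell minimizing the mean of UA and RA gaps."""
    avg_gap = (np.asarray(ua_gap, dtype=float) + np.asarray(ra_gap, dtype=float)) / 2.0
    # Rows index w1, columns index w2.
    i, j = np.unravel_index(np.argmin(avg_gap), avg_gap.shape)
    return w1_values[i], w2_values[j], float(avg_gap[i, j])

# Placeholder grids (NOT results from the paper): gap in % relative to Retrain.
w1_values = [0.0, 0.5, 1.0]
w2_values = [0.0, 0.5, 1.0]
ua_gap = [[9.0, 6.0, 4.0],
          [5.0, 2.0, 3.0],
          [4.0, 3.0, 6.0]]
ra_gap = [[1.0, 2.0, 5.0],
          [2.0, 1.0, 4.0],
          [6.0, 5.0, 8.0]]

best = select_weights(w1_values, w2_values, ua_gap, ra_gap)
print(best)  # (0.5, 0.5, 1.5) for these placeholder grids
```

Averaging the two gaps weights forgetting quality and retained utility equally; a different trade-off would only change how `avg_gap` is formed.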