Improved Localized Machine Unlearning Through the Lens of Memorization

Reihaneh Torkzadehmahani, Reza Nasirigerdeh, Georgios Kaissis, Daniel Rueckert, Gintare Karolina Dziugaite, Eleni Triantafillou

TL;DR

This paper tackles efficient unlearning by focusing on localized updates guided by memorization hypotheses. It introduces a practical localization strategy that uses channel-level aggregation of weighted gradients, culminating in the Deletion by Example Localization (DEL) algorithm, which resets and then finetunes the parameters deemed critical. Across CIFAR-10, SVHN, and ImageNet-100, with both IID and non-IID forget sets, DEL achieves state-of-the-art unlearning metrics against both localized and full-parameter methods, and it improves test accuracy over the prior state-of-the-art localized method. The results indicate that tailoring parameter selection to the forget set, especially under non-IID conditions, offers meaningful gains, and they suggest promising directions for future memory-aware model editing.

Abstract

Machine unlearning refers to efficiently removing the influence of a specified subset of training data from a machine learning model after it has already been trained. This is important for key applications, including making the model more accurate by removing outdated, mislabeled, or poisoned data. In this work, we study localized unlearning, where the unlearning algorithm operates on a (small) identified subset of parameters. Drawing inspiration from the memorization literature, we propose an improved localization strategy that yields strong results when paired with existing unlearning algorithms. We also propose a new unlearning algorithm, Deletion by Example Localization (DEL), that resets the parameters deemed most critical according to our localization strategy, and then finetunes them. Our extensive experiments on different datasets, forget sets, and metrics reveal that DEL sets a new state-of-the-art for unlearning metrics, against both localized and full-parameter methods, while modifying only a small subset of parameters. DEL also outperforms the state-of-the-art localized unlearning method in terms of test accuracy.
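The localization step described above can be pictured with a small sketch. It is an illustration under stated assumptions, not the authors' implementation: the per-channel score (mean of |weight × gradient|, with gradients taken on the forget set), the greedy budget rule, and the names `channel_scores` and `select_channels` are all hypothetical. The text above only states that weighted gradients are aggregated at the channel level and that a small subset of parameters is selected.

```python
from typing import Dict, List

Channel = str  # hypothetical channel identifier, e.g. "conv1.c0"


def channel_scores(weights: Dict[Channel, List[float]],
                   grads: Dict[Channel, List[float]]) -> Dict[Channel, float]:
    """Mean of |w * g| over each channel (an ASSUMED criticality score)."""
    return {
        name: sum(abs(w * g) for w, g in zip(weights[name], grads[name]))
        / len(weights[name])
        for name in weights
    }


def select_channels(scores: Dict[Channel, float],
                    sizes: Dict[Channel, int],
                    budget: float) -> List[Channel]:
    """Greedily keep the highest-scoring channels that fit the parameter budget.

    `budget` is the fraction of all parameters that may be modified.
    """
    limit = budget * sum(sizes.values())
    chosen: List[Channel] = []
    used = 0
    for name in sorted(scores, key=scores.get, reverse=True):
        if used + sizes[name] <= limit:
            chosen.append(name)
            used += sizes[name]
    return chosen


# Toy values standing in for real weights and forget-set gradients.
weights = {
    "conv1.c0": [0.5, -0.5],
    "conv1.c1": [1.0, 1.0],
    "conv2.c0": [0.1, 0.2],
}
grads = {
    "conv1.c0": [0.2, 0.2],
    "conv1.c1": [1.0, -1.0],
    "conv2.c0": [0.0, 0.1],
}

scores = channel_scores(weights, grads)
sizes = {name: len(values) for name, values in weights.items()}
critical = select_channels(scores, sizes, budget=0.5)
print(critical)  # ['conv1.c1']
```

Selecting whole channels rather than individual weights keeps the set of modified parameters structured; the greedy rule here is only one simple way to respect a budget and is not taken from the paper.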

Paper Structure

This paper contains 21 sections, 2 equations, 4 figures, 10 tables, 1 algorithm.

Figures (4)

  • Figure 1: Localized unlearning consists of two parts: a localization strategy that identifies a set of "critical parameters" (dashed line circles) and an unlearning algorithm that aims to remove the influence of the forget set by modifying only the critical parameters (highlighted circles), keeping the rest unchanged. Ideally, the unlearned model should "behave" like the model retrained from scratch, i.e. the two should produce the same (distribution of) outputs; see Definition \ref{defn:unlearning}.
  • Figure 2: Comparison of localization strategies combined with the Reset + Finetune (RFT) unlearning algorithm. An ideal unlearning algorithm would match the "oracle" ("retrain-from-scratch") on each metric, with the smallest possible parameter budget, for increased efficiency. The strategy we will propose later ("ours") yields the best trade-off, with near-perfect unlearning for several budgets.
  • Figure 3: Pairing localization strategies / budgets (e.g. Ours-30% denotes applying our localization strategy to select 30% of parameters) with three unlearning algorithms, on CIFAR-10 / ResNet (the ideal behaviour is to match the "Oracle"). Our method has the best unlearning efficacy when paired with any of the unlearning algorithms, and its performance degrades much less than SalLoc's when the budget is reduced from 30% to 20%; meanwhile, its test accuracy is the same or better.
  • Figure 4: On SVHN with ViT, DEL outperforms state-of-the-art full-parameter and localized unlearning in terms of unlearning quality. L1-sparse has better test accuracy than DEL but has poor unlearning performance. These results are for the non-IID forget set, and $\alpha = 30 \%$ for localized methods; see Table \ref{tab:sota_comparisons_svhn_vit} for full results.
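The second half of the pipeline in Figure 1, and the Reset + Finetune (RFT) algorithm named in Figure 2, can be sketched as follows. This is a minimal illustration, not the paper's code. It makes three assumptions: reset values come from a hypothetical `init_params` dictionary, finetuning is plain gradient descent, and the loss gradient is supplied by a hypothetical `grad_fn` callback (here a toy quadratic). Parameters outside the critical set are never touched.

```python
from typing import Callable, Dict, Set

Params = Dict[str, float]


def reset_and_finetune(params: Params,
                       init_params: Params,
                       critical: Set[str],
                       grad_fn: Callable[[Params], Params],
                       lr: float = 0.1,
                       steps: int = 100) -> Params:
    """Reset the critical parameters, then finetune only those parameters."""
    new_params = dict(params)  # copy so the original model is left intact

    # Stage 1: reset. ASSUMPTION: reset means restoring the values in init_params.
    for name in critical:
        new_params[name] = init_params[name]

    # Stage 2: finetune. Only critical parameters receive gradient updates.
    for _ in range(steps):
        grads = grad_fn(new_params)
        for name in critical:
            new_params[name] -= lr * grads[name]
    return new_params


# Toy loss: sum of (p - target)^2, so the gradient is 2 * (p - target).
targets = {"a": 1.0, "b": 2.0}


def toy_grad(p: Params) -> Params:
    return {name: 2.0 * (p[name] - targets[name]) for name in p}


trained = {"a": 5.0, "b": 3.0}
initial = {"a": 0.0, "b": 0.0}
unlearned = reset_and_finetune(trained, initial, critical={"a"}, grad_fn=toy_grad)
print(unlearned)
```

In this toy run, `a` is reset and driven toward its target of 1.0, while `b` keeps its trained value of 3.0 because it lies outside the critical set. The budget α in the captions above corresponds to how large `critical` is allowed to be.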

Theorems & Definitions (3)

  • Definition 2.1
  • Definition 2.2
  • Definition A.1