Improved Localized Machine Unlearning Through the Lens of Memorization
Reihaneh Torkzadehmahani, Reza Nasirigerdeh, Georgios Kaissis, Daniel Rueckert, Gintare Karolina Dziugaite, Eleni Triantafillou
TL;DR
This paper tackles efficient unlearning through localized updates guided by memorization hypotheses. It introduces a practical localization strategy based on channel-level aggregation of weighted gradients, which underlies the Deletion by Example Localization (DEL) algorithm: DEL resets the parameters identified as critical and then finetunes them. Across CIFAR-10, SVHN, and ImageNet-100, with both IID and non-IID forget sets, DEL achieves state-of-the-art unlearning metrics while maintaining, and often improving, test accuracy relative to prior localized and full-parameter methods. The results indicate that tailoring parameter selection to the forget set, especially under non-IID conditions, yields meaningful gains and points to promising directions for future memory-aware model editing.
Abstract
Machine unlearning refers to efficiently removing the influence of a specified subset of training data from a machine learning model after it has already been trained. This is important for key applications, including making the model more accurate by removing outdated, mislabeled, or poisoned data. In this work, we study localized unlearning, where the unlearning algorithm operates on a (small) identified subset of parameters. Drawing inspiration from the memorization literature, we propose an improved localization strategy that yields strong results when paired with existing unlearning algorithms. We also propose a new unlearning algorithm, Deletion by Example Localization (DEL), that resets the parameters deemed most critical according to our localization strategy and then finetunes them. Our extensive experiments on different datasets, forget sets, and metrics reveal that DEL sets a new state-of-the-art for unlearning metrics against both localized and full-parameter methods while modifying only a small subset of parameters. DEL also outperforms state-of-the-art localized unlearning methods in terms of test accuracy.
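
To make the localize-then-reset idea concrete, the following is a minimal, illustrative sketch in plain Python and not the authors' implementation. It assumes that "weighted gradients" means the magnitude of weight times forget-set gradient, that channel scores are obtained by summing over each channel's parameters, and that resetting means small random re-initialization; the exact scoring rule, selection budget, and reset scheme used by DEL are not specified in this summary. The function names (`channel_scores`, `select_channels`, `reset_channels`) and the toy values are hypothetical.

```python
import random

def channel_scores(weights, grads):
    """Score each channel by summing |weight * gradient| over its parameters.

    Assumption: this is one plausible reading of "channel-level aggregation
    of weighted gradients"; the paper's exact criterion may differ.
    """
    return [
        sum(abs(w * g) for w, g in zip(w_ch, g_ch))
        for w_ch, g_ch in zip(weights, grads)
    ]

def select_channels(scores, fraction):
    """Return the indices of the top `fraction` of channels by score."""
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def reset_channels(weights, selected, rng):
    """Re-initialize the selected channels; leave all other channels untouched."""
    out = [list(ch) for ch in weights]
    for i in selected:
        # Assumption: small Gaussian re-initialization stands in for the reset.
        out[i] = [rng.gauss(0.0, 0.01) for _ in out[i]]
    return out

# Toy layer with 4 channels of 2 parameters each (hypothetical values).
weights = [[0.5, -0.2], [1.0, 2.0], [0.1, 0.1], [-1.5, 0.3]]
# Gradients of the loss on the forget set with respect to those parameters.
forget_grads = [[0.1, 0.1], [0.9, -0.8], [0.0, 0.2], [0.05, 0.1]]

scores = channel_scores(weights, forget_grads)
selected = select_channels(scores, fraction=0.25)
new_weights = reset_channels(weights, selected, random.Random(0))
```

In this toy setup, only the highest-scoring channel is reset. A full pipeline would then finetune just the reset channels while keeping the remaining parameters frozen; that step is omitted here.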
