How Does Overparameterization Affect Machine Unlearning of Deep Neural Networks?

Gal Alon, Yehuda Dar

TL;DR

The paper investigates how the parameterization level of deep neural networks, captured by width, affects machine unlearning for removing specific training data. It develops a validation-based hyperparameter-tuning framework for SCRUB, NegGrad, and L1 sparsity unlearning methods and evaluates two unlearning goals—privacy of forgotten data and bias removal—across under- and overparameterized DNNs on multiple datasets. The key findings show that overparameterized models generally achieve a better balance between maintaining generalization and fulfilling the unlearning goals, with bias removal requiring explicit use of the forget set, and privacy improvements explained by localized changes in decision regions around forget samples. These results provide guidance for selecting architectures and unlearning methods in practice and motivate future parameterization-aware unlearning approaches.

Abstract

Machine unlearning is the task of updating a trained model to forget specific training data without retraining from scratch. In this paper, we investigate how unlearning of deep neural networks (DNNs) is affected by the model parameterization level, which corresponds here to the DNN width. We define validation-based tuning for several unlearning methods from the recent literature, and show how these methods perform differently depending on (i) the DNN parameterization level, (ii) the unlearning goal (unlearned data privacy or bias removal), and (iii) whether the unlearning method explicitly uses the unlearned examples. Our results show that unlearning excels on overparameterized models in terms of balancing generalization against achieving the unlearning goal, although for bias removal this requires the unlearning method to use the unlearned examples. We further elucidate our error-based analysis by measuring how much unlearning changes the classification decision regions near the unlearned examples while leaving them unchanged elsewhere. This shows that the unlearning success of overparameterized models stems from their ability to delicately change the model functionality in small regions of the input space while keeping much of the model functionality unchanged.
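The decision-region measurement described in the abstract can be illustrated with a small sketch. This is not the paper's actual scoring procedure: the function names, the Gaussian probing scheme, the `radius` and `n_probes` parameters, and the toy 2D classifiers are all assumptions made for illustration. The idea is only to compare how often two models disagree near the forget examples versus elsewhere in the input space.

```python
import numpy as np

def disagreement_rate(predict_before, predict_after, points):
    """Fraction of points whose predicted label differs between the two models."""
    return float(np.mean(predict_before(points) != predict_after(points)))

def local_and_global_change(predict_before, predict_after, forget_x, other_x,
                            radius=0.1, n_probes=50, seed=0):
    """Hypothetical helper: disagreement near forget examples vs. elsewhere."""
    rng = np.random.default_rng(seed)
    dim = forget_x.shape[1]
    # Probe points: small Gaussian perturbations around each forget example.
    noise = rng.normal(scale=radius, size=(len(forget_x), n_probes, dim))
    near_forget = (forget_x[:, None, :] + noise).reshape(-1, dim)
    return (disagreement_rate(predict_before, predict_after, near_forget),
            disagreement_rate(predict_before, predict_after, other_x))

# Toy stand-ins for the original and unlearned models (assumed, not from the paper).
forget_point = np.array([1.0, 1.0])

def predict_before(x):
    # Linear decision boundary at x[0] = 0.
    return (x[:, 0] > 0).astype(int)

def predict_after(x):
    # Same boundary, except labels are flipped inside a small ball around the forget point.
    base = predict_before(x)
    inside = np.linalg.norm(x - forget_point, axis=1) < 0.5
    return np.where(inside, 1 - base, base)

forget_x = forget_point[None, :]
other_x = np.random.default_rng(1).uniform(-3.0, -1.0, size=(200, 2))  # far from the forget point

local_change, elsewhere_change = local_and_global_change(
    predict_before, predict_after, forget_x, other_x)
print(local_change, elsewhere_change)
```

In this toy setting the unlearned model differs from the original almost everywhere near the forget example and nowhere else, which is the kind of localized change the abstract attributes to overparameterized models.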

Paper Structure

This paper contains 29 sections, 11 equations, 30 figures.

Figures (30)

  • Figure 1: Unlearning for privacy (ResNet-18, CIFAR-10, 200 unlearned examples). In this plot, unlearning is optimized for privacy. The marker sizes correspond to the validation $\lambda$ values (a larger marker indicates a larger $\lambda$). The $\lambda$ values used are 0.2, 0.4, and 0.6. See extended results in Appendix Fig. \ref{fig:cifar10 resnet18 privacy results}.
  • Figure 2: Unlearning for bias removal (ResNet-18, CIFAR-10, 200 unlearned examples). In this plot, unlearning is optimized for bias removal. The marker sizes correspond to the validation $\lambda$ values (a larger marker indicates a larger $\lambda$). The $\lambda$ values used are 0.15, 0.3, and 0.5. See extended results in Appendix Fig. \ref{fig:cifar10 resnet18 bias results}.
  • Figure 3: Unlearning for privacy (ResNet-18, CIFAR-10, 200 unlearned examples) MIA accuracy results. Best privacy corresponds to the horizontal level of MIA accuracy 0.5.
  • Figure 4: Decision regions (ResNet-18, CIFAR-10, 200 unlearned examples), comparing an overparameterized model (DNN width scale = 1) to an underparameterized model (DNN width scale = 0.1). Validation for bias removal is with $\lambda=0.3$, and for privacy is with $\lambda=0.1$.
  • Figure 5: Similarity and change scores for unlearned models (ResNet-18, CIFAR-10, 200 unlearned examples). The unlearning goal is privacy with unlearning validation with $\lambda=0.4$. See extended results in Fig. \ref{fig:similarity_change_privacy_resnet}.
  • ...and 25 more figures
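
The Figure 3 caption states that the best privacy corresponds to an MIA (membership inference attack) accuracy of 0.5. A minimal sketch of why follows, assuming a simple loss-threshold attack evaluated on equally sized sets of forget examples and never-seen examples. The attack form, the `mia_accuracy` helper, the threshold, and the synthetic exponential loss distributions are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def mia_accuracy(forget_losses, unseen_losses, threshold):
    """Accuracy of an attack that guesses 'was in training' when loss < threshold."""
    hits_on_forget = np.sum(forget_losses < threshold)    # forget examples flagged as members
    hits_on_unseen = np.sum(unseen_losses >= threshold)   # unseen examples flagged as non-members
    return float((hits_on_forget + hits_on_unseen)
                 / (len(forget_losses) + len(unseen_losses)))

rng = np.random.default_rng(0)
n = 5000
threshold = 0.3  # assumed attack threshold

# Synthetic per-example losses (assumed): unseen data has typical test-like losses.
unseen_losses = rng.exponential(scale=1.0, size=n)

# Before unlearning: forget examples were fitted, so their losses are much smaller.
forget_losses_before = rng.exponential(scale=0.05, size=n)
acc_before = mia_accuracy(forget_losses_before, unseen_losses, threshold)

# After ideal unlearning: forget examples look like unseen data to the model.
forget_losses_after = rng.exponential(scale=1.0, size=n)
acc_after = mia_accuracy(forget_losses_after, unseen_losses, threshold)

print(f"MIA accuracy before unlearning: {acc_before:.3f}")
print(f"MIA accuracy after unlearning:  {acc_after:.3f}")
```

When the forget examples are statistically indistinguishable from unseen data, no threshold separates the two balanced groups, so the attack does no better than a coin flip. An accuracy well above 0.5 indicates that the model still treats the forget examples differently from data it never saw.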