Deep Unlearning: Fast and Efficient Gradient-free Approach to Class Forgetting

Sangamesh Kodge, Gobinda Saha, Kaushik Roy

TL;DR

This work introduces a novel class unlearning algorithm that strategically eliminates specific classes from a learned model, with competitive unlearning performance and resilience against Membership Inference Attacks (MIA).

Abstract

Machine unlearning is a prominent and challenging field, driven by regulatory demands for user data deletion and heightened privacy awareness. Existing approaches involve retraining the model or running multiple finetuning steps for each deletion request, and are often constrained by computational limits and restricted data access. In this work, we introduce a novel class unlearning algorithm designed to strategically eliminate specific classes from the learned model. Our algorithm first estimates the Retain and the Forget Spaces using Singular Value Decomposition on the layerwise activations for a small subset of samples from the retain and unlearn classes, respectively. We then compute the shared information between these spaces and remove it from the Forget Space to isolate the class-discriminatory feature space. Finally, we obtain the unlearned model by updating the weights to suppress the class-discriminatory features from the activation spaces. We demonstrate our algorithm's efficacy on ImageNet using a Vision Transformer, with only a $\sim 1.5\%$ drop in retain accuracy compared to the original model while maintaining under $1\%$ accuracy on the unlearned class samples. Furthermore, our algorithm exhibits competitive unlearning performance and resilience against Membership Inference Attacks (MIA). Compared to baselines, it achieves an average accuracy improvement of $1.38\%$ on the ImageNet dataset while requiring up to $10\times$ fewer samples for unlearning. Additionally, under stronger MIAs on the CIFAR-100 dataset using a ResNet18 architecture, our approach outperforms the best baseline by $1.8\%$. Our code is available at https://github.com/sangamesh-kodge/class_forgetting.
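
The three steps in the abstract (estimate the two spaces, remove their shared part from the Forget Space, update the weights) can be illustrated for a single linear layer. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses a hard top-k truncation of the singular vectors as the space estimate, treats activations as columns of a matrix, and the function names `top_k_basis` and `unlearn_linear_layer` are hypothetical. The paper's actual procedure may differ, for example in how each space is weighted.

```python
import numpy as np

def top_k_basis(acts, k):
    """Orthonormal basis (d x k) for the top-k directions of the activations.

    acts has shape (d, n): one column per sample (an assumption of this sketch).
    """
    u, _, _ = np.linalg.svd(acts, full_matrices=False)
    return u[:, :k]

def unlearn_linear_layer(weight, retain_acts, forget_acts, k_retain, k_forget):
    """Suppress forget-only input directions of a linear layer y = weight @ x."""
    u_r = top_k_basis(retain_acts, k_retain)
    u_f = top_k_basis(forget_acts, k_forget)
    p_r = u_r @ u_r.T                      # projector onto the estimated Retain Space
    p_f = u_f @ u_f.T                      # projector onto the estimated Forget Space
    eye = np.eye(weight.shape[1])
    # Remove the shared part (p_f @ p_r) from the Forget Space projector.
    p_dis = p_f @ (eye - p_r)
    # Update the weights so the discriminatory directions are projected out.
    return weight @ (eye - p_dis)

# Toy data (assumed): retain activations live on axes 0 and 1,
# forget activations on axes 1 and 2, so axis 1 is shared.
rng = np.random.default_rng(0)
d = 6
retain_acts = np.zeros((d, 10))
retain_acts[[0, 1]] = rng.normal(size=(2, 10))
forget_acts = np.zeros((d, 10))
forget_acts[[1, 2]] = rng.normal(size=(2, 10))
w = rng.normal(size=(3, d))
w_new = unlearn_linear_layer(w, retain_acts, forget_acts, k_retain=2, k_forget=2)
```

In this toy setting only axis 2 is specific to the forget data, so the update zeroes the corresponding weight column and leaves the shared axis 1 and all other columns unchanged. No gradients are computed at any point, which matches the gradient-free framing in the title.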

Paper Structure

This paper contains 32 sections, 7 equations, 15 figures, 9 tables, 3 algorithms.

Figures (15)

  • Figure 1: Illustration of the unlearning algorithm on a simple 4-class classification problem. The figure shows the decision boundaries for (a) the original model, (b) our unlearned model, which redistributes the space to nearby classes, and (c) a model retrained without the red class.
  • Figure 2: Membership Inference Attack.
  • Figure 3: U-LIRA (Hayes et al., 2024) Membership Inference Attack for the CIFAR10 dataset on a ResNet18 model.
  • Figure 4: Effect of varying (a) $\alpha_r$ and (b) $\alpha_f$ for the Cat class of the CIFAR10 dataset on a VGG11 network.
  • Figure 5: Layer-wise weight change for VGG11 on the CIFAR10 dataset.
  • ...and 10 more figures