Attack and Reset for Unlearning: Exploiting Adversarial Noise toward Machine Unlearning through Parameter Re-initialization

Yoonhwa Jung, Ikhyun Cho, Shun-Hsiang Hsu, Julia Hockenmaier

TL;DR

This work tackles machine unlearning in a task-agnostic setting: forgetting specific instances that contain personal data while preserving the model’s performance on its original task. It introduces Attack-and-Reset for Unlearning (ARU), a three-stage framework that uses carefully crafted adversarial noise to reveal features biased toward the forget set, resets the corresponding convolutional filters via parameter re-initialization, and then fine-tunes on the retain data. ARU achieves state-of-the-art results on the MUFAC and MUCAC benchmarks, with higher NoMUS scores obtained by combining adversarially guided parameter masking with targeted re-initialization, without sacrificing utility. The method offers a practical, scalable route to unlearning in deep networks and highlights adversarial perturbations as a tool for selective parameter reconfiguration and faster convergence during retraining. Overall, ARU provides a data-driven, computationally efficient alternative to full retraining or class-specific unlearning, with implications for privacy-preserving ML deployments.

Abstract

With growing concerns surrounding privacy and regulatory compliance, the concept of machine unlearning has gained prominence, aiming to selectively forget or erase specific learned information from a trained model. In response to this critical need, we introduce a novel approach called Attack-and-Reset for Unlearning (ARU). This algorithm leverages meticulously crafted adversarial noise to generate a parameter mask, effectively resetting certain parameters and rendering them unlearnable. ARU outperforms current state-of-the-art results on two facial machine-unlearning benchmark datasets, MUFAC and MUCAC. In particular, we present the steps involved in attacking and masking that strategically filter and re-initialize network parameters biased towards the forget set. Our work represents a significant advancement in rendering data unexploitable to deep learning models through parameter re-initialization, achieved by harnessing adversarial noise to craft a mask.
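
The mask-and-reset idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the per-filter scores, the top-fraction selection rule, the `ratio` and `scale` parameters, and the helper names `select_filters` and `reset_filters` are all assumptions made for illustration. The abstract says only that adversarial noise is used to craft a mask over parameters biased toward the forget set, and that those parameters are re-initialized.

```python
import random


def select_filters(scores, ratio):
    """Pick the indices of the highest-scoring filters (hypothetical rule).

    `scores` stands in for how strongly each filter responds to the
    adversarial noise; how ARU actually scores filters is not stated here.
    """
    k = max(1, int(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def reset_filters(filters, masked, rng, scale=0.05):
    """Re-initialize masked filters with fresh Gaussian weights; copy the rest."""
    out = []
    for i, weights in enumerate(filters):
        if i in masked:
            out.append([rng.gauss(0.0, scale) for _ in weights])
        else:
            out.append(list(weights))
    return out


rng = random.Random(0)
# Eight toy "filters", each flattened to 9 weights (a 3x3 kernel).
filters = [[rng.uniform(-1.0, 1.0) for _ in range(9)] for _ in range(8)]
# Hypothetical per-filter responses to adversarial noise.
scores = [0.1, 0.9, 0.2, 0.8, 0.05, 0.3, 0.15, 0.25]

masked = select_filters(scores, ratio=0.25)
new_filters = reset_filters(filters, set(masked), rng)
```

In a full pipeline, the reset model would then be fine-tuned on the retain data, as the TL;DR describes; that step is omitted here.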

Paper Structure

This paper contains 36 sections, 5 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: ARU: An overview of our proposed parameter masking for model re-initialization.
  • Figure 2: Comparison of Forgetting vs. Utility among various unlearning models and ARU. An ideal model would achieve a perfect forgetting score (i.e., a Forgetting score of 0) while maintaining a utility score equivalent to that of a model fine-tuned on the retain data. ARU performs closest to this ideal.
  • Figure 3: Comparison between adversarial noise and random noise. In line with our intuition, adversarial noise captures low-level information from the raw image, delineating facial and hair outlines and the background.
  • Figure 4: Feature Map Comparison between the Original Model and Unlearned Model via ARU on an Unseen Data Sample