Partially Blinded Unlearning: Class Unlearning for Deep Networks a Bayesian Perspective

Subhodip Panda, Shashwat Sourav, Prathosh A. P

TL;DR

This paper addresses the removal of information tied to a specific data class from a pre-trained deep classifier under privacy constraints. It introduces Partially-Blinded Unlearning (PBU), a Bayesian, single-step method whose loss combines the log-likelihood of the unlearned data with a stability regularizer: a Mahalanobis term weighted by the Fisher Information matrix and an $\ell_2$ term, both measured from the pre-trained parameters $\theta^*$. PBU needs access only to the unlearning data $\mathcal{S}_n$, so it avoids both full-dataset access and retraining from scratch. Experiments on MNIST, CIFAR-10/100, and FOOD-101 with ResNet-18/34/50 and All-CNN show that PBU degrades performance on the unlearned class while preserving or improving retained-class performance, keeps membership inference attack success below random chance, and has a predictable, one-shot computational cost. Ablation studies confirm the stability regularizer's role in maintaining retained-class accuracy, which supports PBU's practicality for privacy-preserving unlearning in real-world deployments.

Abstract

To adhere to regulatory standards governing individual data privacy and safety, machine learning models must systematically eliminate information derived from specific subsets of a user's training data that can no longer be utilized. Machine Unlearning has emerged as a pivotal area of research that enables selectively discarding information associated with specific sets or classes of data from a pre-trained model, thereby eliminating the need for extensive retraining from scratch. The principal aim of this study is to formulate a methodology for the purposeful elimination of information linked to a specific class of data from a pre-trained classification network. This removal is designed to degrade the model's performance on the unlearned data class while minimizing any detrimental impact on its performance on the other classes. To achieve this goal, we frame the class unlearning problem from a Bayesian perspective, which yields a loss function that minimizes the log-likelihood associated with the unlearned data together with a stability regularization in parameter space. This stability regularization incorporates the Mahalanobis distance with respect to the Fisher Information matrix and the $\ell_2$ distance from the pre-trained model parameters. Our novel approach, termed Partially-Blinded Unlearning (PBU), surpasses existing state-of-the-art class unlearning methods, demonstrating superior effectiveness. Notably, PBU achieves this without requiring access to the entire training dataset; it needs only the unlearned data points, which is a distinctive feature of the method.
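
The three-term loss described above can be sketched on a toy model. The following is a minimal NumPy illustration, not the authors' implementation: it assumes a linear softmax classifier, a diagonal empirical approximation of the Fisher Information matrix computed at $\theta^*$, and hypothetical weights `lam_fisher` and `lam_l2`. All function names are made up for this example.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def diag_fisher(theta_star, x, y):
    # Assumption: diagonal empirical Fisher, i.e. the mean of squared
    # per-sample gradients of log p(y | x) at the pre-trained parameters.
    probs = np.exp(log_softmax(x @ theta_star))            # (n, k)
    onehot = np.eye(theta_star.shape[1])[y]                # (n, k)
    # For logits = x @ W, d log p(y|x) / dW = outer(x, onehot - probs).
    grads = x[:, :, None] * (onehot - probs)[:, None, :]   # (n, d, k)
    return (grads ** 2).mean(axis=0)                       # (d, k)

def pbu_style_loss(theta, theta_star, fisher_diag, x_forget, y_forget,
                   lam_fisher=1.0, lam_l2=0.1):
    # Term 1: mean log-likelihood of the data to be unlearned.
    # Minimizing it pushes probability mass away from the forgotten labels.
    log_probs = log_softmax(x_forget @ theta)
    log_lik = log_probs[np.arange(len(y_forget)), y_forget].mean()
    # Terms 2 and 3: stability regularizers that keep theta near theta_star.
    delta = theta - theta_star
    fisher_dist = np.sum(fisher_diag * delta ** 2)  # squared Fisher-weighted distance
    l2_dist = np.sum(delta ** 2)                    # squared l2 distance
    return log_lik + lam_fisher * fisher_dist + lam_l2 * l2_dist

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_forget = rng.normal(size=(8, 5))       # toy features of the class to forget
    y_forget = rng.integers(0, 3, size=8)    # toy labels
    theta_star = rng.normal(size=(5, 3))     # stands in for pre-trained weights
    fisher = diag_fisher(theta_star, x_forget, y_forget)
    print(pbu_style_loss(theta_star + 0.1, theta_star, fisher, x_forget, y_forget))
```

At `theta == theta_star` both regularizers vanish and the loss equals the mean log-likelihood. Any move away from `theta_star` that lowers the likelihood of the forgotten data is paid for through the two distance terms, which is the stability trade-off the abstract describes.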

Paper Structure

This paper contains 17 sections, 4 theorems, 15 equations, 2 figures, 4 tables, 1 algorithm.

Key Result

Theorem

The proof can be found in the cited text (Moulin and Veeravalli, book chapter, Lemma 13.1). If the regularity conditions (reg-1)-(reg-4) hold, then $\forall \theta \in \Theta$ the Fisher Information matrix can be written in the following form
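
The displayed form is not reproduced here. For orientation, the standard identity that holds under regularity conditions of this kind, and which the statement plausibly refers to, equates the expected outer product of the score with the negative expected Hessian of the log-likelihood. The symbols $I(\theta)$, $p_\theta$, and $X$ below are assumptions for illustration, not notation confirmed by the paper:

$$
I(\theta) \;=\; \mathbb{E}_{X \sim p_\theta}\!\left[\nabla_\theta \log p_\theta(X)\, \nabla_\theta \log p_\theta(X)^{\top}\right] \;=\; -\,\mathbb{E}_{X \sim p_\theta}\!\left[\nabla_\theta^{2} \log p_\theta(X)\right]
$$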

Figures (2)

  • Figure 1: Partially Blinded Unlearning (PBU) Method: Given user-identified samples to be unlearned ($\mathcal{S}_n$), our unlearning method employs a two-component perturbation technique, expressed as a loss function with three terms. The first term (shown in the bottom half) is the perturbation in the output space, which minimizes the log-likelihood associated with the unlearned class. The last two terms are perturbations in the parameter space (shown in the upper half): the Mahalanobis distance with respect to the Fisher Information matrix and the $\ell_2$ distance.
  • Figure 2: Unlearning time of our method compared with the Fast Yet Effective and Bad Teaching approaches.

Theorems & Definitions (9)

  • Definition
  • Theorem
  • Lemma
  • Proof
  • Lemma
  • Proof
  • Theorem
  • Proof
  • Remark