DeepClean: Machine Unlearning on the Cheap by Resetting Privacy Sensitive Weights using the Fisher Diagonal

Jiaeli Shi, Najah Ghalyan, Kostis Gourgoulias, John Buford, Sean Moran

TL;DR

This work tackles the privacy challenge of retroactively forgetting sensitive data from trained models without full retraining. It introduces DeepClean, which computes the diagonal of the Fisher Information Matrix (FIM) on two data splits, the forget set $D_f$ and the retain set $D_r$, and uses the per-weight ratio $r(w_i)=I_{D_f}(w_i)/I_{D_r}(w_i)$, compared against a fixed threshold $\gamma$, to identify a small subset of weights for retraining. The selected weights are initialized to zero and fine-tuned on the retain data while the rest are frozen. In this way DeepClean forgets $D_f$ while preserving accuracy on $D_r$, outperforming several influence-function and Fisher-based baselines across CNNs on MNIST and CIFAR datasets. The results point to a practical, model-agnostic, and efficient unlearning paradigm that mitigates privacy risks in deployed models without costly full retraining, making the diagonal-FIM approach a viable tool for real-world data governance.

Abstract

Machine learning models trained on sensitive or private data can inadvertently memorize and leak that information. Machine unlearning seeks to retroactively remove such details from model weights to protect privacy. We contribute a lightweight unlearning algorithm that leverages the Fisher Information Matrix (FIM) for selective forgetting. Prior work in this area requires full retraining or large matrix inversions, which are computationally expensive. Our key insight is that the diagonal elements of the FIM, which measure the sensitivity of log-likelihood to changes in weights, contain sufficient information for effective forgetting. Specifically, we compute the FIM diagonal over two subsets -- the data to retain and forget -- for all trainable weights. This diagonal representation approximates the complete FIM while dramatically reducing computation. We then use it to selectively update weights to maximize forgetting of the sensitive subset while minimizing impact on the retained subset. Experiments show that our algorithm can successfully forget any randomly selected subsets of training data across neural network architectures. By leveraging the FIM diagonal, our approach provides an interpretable, lightweight, and efficient solution for machine unlearning with practical privacy benefits.
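
The select-reset-fine-tune idea described above can be illustrated with a minimal sketch on a toy logistic-regression model. This is not the authors' implementation. Several choices here are assumptions made for illustration: the empirical Fisher estimate (mean squared gradient of the log-likelihood under the observed labels), the selection rule $r(w_i) > \gamma$, the small constant `eps` guarding against division by zero, the toy data, and all function names and hyperparameters.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def fisher_diagonal(weights, data):
    """Empirical Fisher diagonal (an assumed estimator): the mean squared
    gradient of the log-likelihood with respect to each weight."""
    diag = [0.0] * len(weights)
    for x, y in data:
        p = predict(weights, x)
        for i, xi in enumerate(x):
            grad_i = (y - p) * xi  # d log p(y | x, w) / d w_i
            diag[i] += grad_i ** 2
    return [d / len(data) for d in diag]


def select_forget_weights(weights, forget, retain, gamma, eps=1e-12):
    """Flag weights whose ratio I_{D_f}(w_i) / I_{D_r}(w_i) exceeds gamma.
    eps is a hypothetical guard against division by zero."""
    i_f = fisher_diagonal(weights, forget)
    i_r = fisher_diagonal(weights, retain)
    ratio = [f / (r + eps) for f, r in zip(i_f, i_r)]
    return [r > gamma for r in ratio], ratio


def reset_and_finetune(weights, mask, retain, lr=0.5, steps=100):
    """Zero the flagged weights, then run gradient ascent on the retain-set
    log-likelihood, updating only the flagged weights (the rest are frozen)."""
    w = [0.0 if m else wi for wi, m in zip(weights, mask)]
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in retain:
            p = predict(w, x)
            for i, xi in enumerate(x):
                grad[i] += (y - p) * xi
        for i, m in enumerate(mask):
            if m:
                w[i] += lr * grad[i] / len(retain)
    return w


# Toy data: the third feature is active only in the forget set.
trained = [1.0, -1.0, 2.0]
retain = [([1.0, 0.0, 0.0], 1), ([0.0, 1.0, 0.0], 0),
          ([1.0, 1.0, 0.0], 1), ([2.0, 0.0, 0.0], 1)]
forget = [([0.1, 0.0, 1.0], 0), ([0.0, 0.1, 1.0], 0)]

mask, ratio = select_forget_weights(trained, forget, retain, gamma=2.0)
new_w = reset_and_finetune(trained, mask, retain)
print(mask, new_w)
```

In this toy setup only the third weight is flagged, because it carries information about $D_f$ and none about $D_r$. After the reset it stays at zero, since the retain data provide no gradient signal for it. For a deep network the same quantities would typically be accumulated from per-sample gradients obtained by automatic differentiation.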

Paper Structure (6 sections, 2 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Overview of DeepClean, a new baseline for machine unlearning. While conceptually simple, DeepClean is computationally efficient and empirically effective (see below).
  • Figure 2: Unlearning quality comparison between DeepClean and competitive unlearning algorithms (see the experiments section). Accuracy and MIA should be close to those of the Gold model. Unlearning time should be short.
  • Figure 3: Number of $D_r$-important weights vs. $\gamma$ for two unlearning scenarios on CIFAR-10 with VGG-16. The threshold $\gamma$ controls how much of the model must be updated to forget the influence of $D_f$. The range from $2$ to $3$ indicates potential sweet spots. Taking $\gamma$ close to $0$ means most of the model must be updated. For both unlearning scenarios, $\gamma{=}2$ gives good $\Delta \textit{MIA}$ performance.
  • Figure 4: Performance on the utility and unlearning tasks across the $\gamma$ range from $2$ to $3$. The number of $D_r$-important weights decreases at an increasing pace within this range.