Instance-Level Difficulty: A Missing Perspective in Machine Unlearning

Hammad Rizwan, Mahtab Sarvmaili, Hassan Sajjad, Ga Wu

TL;DR

The paper shows that machine unlearning is not uniformly feasible across individual data points; by defining a ground-truth harmonic score and evaluating six candidate factors across multiple algorithms and datasets, it finds that four factors consistently relate to unlearning difficulty, while several commonly used indices fail to predict per-sample outcomes. The work demonstrates the need for instance-level evaluation and proposes that post-unlearning performance is a strong practical indicator of difficulty, urging the development of a unified predictive index. It also contrasts data-centric and model-centric perspectives to understand unlearning challenges and calls for more nuanced metrics and methods to support real-world Right-to-be-Forgotten requirements.

Abstract

Current research on deep machine unlearning primarily focuses on improving or evaluating the overall effectiveness of unlearning methods while overlooking the varying difficulty of unlearning individual training samples. As a result, the broader feasibility of machine unlearning remains under-explored. This paper studies the cruxes that make machine unlearning difficult through a thorough instance-level unlearning performance analysis over various unlearning algorithms and datasets. In particular, we summarize four factors that make unlearning a data point difficult, and we empirically show that these factors are independent of a specific unlearning algorithm but only relevant to the target model and its training data. Given these findings, we argue that machine unlearning research should pay attention to the instance-level difficulty of unlearning.

Paper Structure (28 sections, 8 equations, 22 figures, 2 tables)

Figures (22)

  • Figure 1: Effectiveness of using Tolerance of Preference Shift (TPS) as an index of unlearning difficulty. For all four unlearning algorithms, there is a consistent positive alignment between TPS and the empirical unlearning outcome (GT). The experiments are conducted on a ResNet-18 model trained on the SVHN dataset.
  • Figure 2: Effectiveness of using Distance of Preference Shift (DPS) as an index of unlearning difficulty. For three out of the four unlearning algorithms, we observe a negative alignment between DPS and the empirical unlearning outcome (GT). For SalUn, there is no clear correlation between DPS and GT. The experiments are conducted on a ResNet-18 model trained on the SVHN dataset.
  • Figure 3: Effectiveness of using Geometric Distance to Decision Boundary (GDDB) as an index of unlearning difficulty. (Top) Distance to the decision boundary estimated with DeepFool, a method from the adversarial learning literature. (Bottom) Distance to the decision boundary estimated by treating the last layer of the neural network as a linear classifier. There is no observable correlation between empirical unlearning difficulty and the training data's geometric distance to the decision boundary. The experiments are conducted on a ResNet-18 model trained on the SVHN dataset.
  • Figure 4: Effectiveness of using Number of Unlearning Epochs (NUE) as an index of unlearning difficulty. We observe a noisy negative alignment between NUE and GT for gradient-based approaches. For SCRUB and SalUn, there is no observable correlation. The experiments are conducted on a ResNet-18 model trained on the SVHN dataset.
  • Figure 5: Effectiveness of using MIA as an index of unlearning difficulty. A prediction of "1" indicates successful unlearning, where the unlearned model no longer retains information about the data, while "0" signifies failed unlearning, where the model still remembers the unlearned samples. We observe a noisy positive alignment between MIA and GT for gradient-based approaches: a larger GT indicates a data point that is easier to unlearn, and this is positively associated with MIA = "1". For SCRUB, there is no observable correlation. The experiments are conducted on a ResNet-18 model trained on the SVHN dataset.
  • ...and 17 more figures
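
The bottom-panel estimate in Figure 3 treats the last layer of the network as a linear classifier. One plausible reading of that caption is sketched below; it is an illustration under stated assumptions, not the authors' implementation. It assumes the distance is measured in the penultimate-layer feature space, from a feature vector to the nearest pairwise hyperplane separating the predicted class from any other class. The function name `linear_boundary_distance` and its arguments are hypothetical.

```python
import math


def linear_boundary_distance(features, weights, biases):
    """Distance from a feature vector to the nearest pairwise decision
    boundary of a linear classifier with logits = W @ features + b.

    Assumption (not from the paper): the boundary between the predicted
    class p and another class k is the hyperplane where logit_p == logit_k.
    """
    # Logit for each class: one weight row and one bias per class.
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    predicted = max(range(len(logits)), key=logits.__getitem__)

    best = math.inf
    for other in range(len(logits)):
        if other == predicted:
            continue
        # Normal vector of the hyperplane separating the two classes.
        diff = [wp - wo for wp, wo in zip(weights[predicted], weights[other])]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            continue  # identical weight rows define no hyperplane
        # Point-to-hyperplane distance: logit gap divided by the normal's norm.
        margin = logits[predicted] - logits[other]
        best = min(best, margin / norm)
    return predicted, best


if __name__ == "__main__":
    # Toy two-class example: the boundary is the line x == y.
    pred, dist = linear_boundary_distance(
        [3.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
    )
    print(pred, dist)
```

In the toy example the point (3, 1) lies at distance |3 - 1| / sqrt(2) from the line x = y, which is what the function computes. In practice `features` would be the penultimate-layer activations of a training sample and `weights`, `biases` the parameters of the final fully connected layer.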