When unlearning is free: leveraging low influence points to reduce computational costs

Anat Kleiman, Robert Fisher, Ben Deaner, Udi Wieder

TL;DR

This work tackles data privacy by reducing the cost of unlearning through pre-filtering the forget and retain sets using influence scores. It develops a theory-backed, algorithm-agnostic framework that identifies low-impact training points via approximate influence functions (Hessian-based, LESS, and Lowest Gradients) and validates it across vision and language tasks. Empirically, removing these low-influence points preserves model privacy (MIA) and accuracy while cutting unlearning time by up to about 50% in real-world scenarios. The approach demonstrates practical, cross-domain applicability and supports efficient, privacy-preserving data removal in deployed ML systems.

Abstract

As concerns around data privacy in machine learning grow, the ability to unlearn, or remove, specific data points from trained models becomes increasingly important. While state-of-the-art unlearning methods have emerged in response, they typically treat all points in the forget set equally. In this work, we challenge this approach by asking whether points that have a negligible impact on the model's learning need to be removed. Through a comparative analysis of influence functions across language and vision tasks, we identify subsets of training data with negligible impact on model outputs. Leveraging this insight, we propose an efficient unlearning framework that reduces the size of datasets before unlearning, leading to significant computational savings (up to approximately 50 percent) on real-world empirical examples.
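To make the pre-filtering idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a logistic-regression model and uses the per-example gradient norm as a stand-in influence score, loosely in the spirit of a lowest-gradients heuristic. The function names `grad_norm` and `prefilter_forget_set`, the synthetic data, and the 50% drop fraction are all hypothetical choices made for illustration.

```python
import math
import random


def grad_norm(w, x, y):
    """L2 norm of the logistic-loss gradient for one example (assumed proxy score)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = 1.0 / (1.0 + math.exp(-z))  # predicted probability
    # The gradient for this example is (p - y) * x, so its norm factorizes.
    return abs(p - y) * math.sqrt(sum(xi * xi for xi in x))


def prefilter_forget_set(scores, forget_idx, drop_fraction):
    """Drop the lowest-scoring fraction of the forget set; return the kept indices."""
    n_drop = int(drop_fraction * len(forget_idx))
    ranked = sorted(forget_idx, key=lambda i: scores[i])  # ascending score
    return sorted(ranked[n_drop:])


# Synthetic stand-in for a trained model and its training data (hypothetical).
random.seed(0)
w = [random.gauss(0, 1) for _ in range(5)]
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(200)]
Y = [float(random.random() < 0.5) for _ in range(200)]

scores = [grad_norm(w, x, y) for x, y in zip(X, Y)]
forget_idx = list(range(50))  # indices requested for removal
reduced_forget = prefilter_forget_set(scores, forget_idx, drop_fraction=0.5)
```

Only `reduced_forget` would then be handed to whichever unlearning algorithm is in use. The scoring function is deliberately swappable, since the framework described above is agnostic to both the influence estimator and the unlearning method.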

Paper Structure

This paper contains 34 sections, 2 theorems, 31 equations, 13 figures, 2 tables, 1 algorithm.

Key Result

Theorem 3.1

Suppose that, for $\alpha$ sufficiently close to $1$, the function $w\mapsto \sum_{i\in S_{train}}\alpha_{i}\ell(w;z_{i})$ is strictly convex and twice differentiable. It follows that,

Figures (13)

  • Figure 4.1: The model is retrained on all but the n lowest influence points (x-axis) from image datasets CIFAR-10 (left) and CIFAR-100 (right) and the final model accuracy on the removed points is recorded (y-axis). All influence methods are able to remove up to $\sim$2,000 points with minimal effects on accuracy, before this drops rapidly. The LESS and Hessian methods using self-influence consistently outperform the others.
  • Figure 4.2: Using a setup similar to Figure 4.1, the model is now retrained on all but the n lowest influence points from language dataset SQuAD. Accuracy drops steadily for most methods up to $\sim$10,000 points removed. Furthermore, LESS and Hessian with self influence still outperform other methods.
  • Figure 5.1: Proportions of low influence points are removed from the CIFAR-10 forget (left) and retain (right) sets using the influence scores on CIFAR-10. The reduced sets are then used in the Rank 1 unlearning algorithm. The final unlearned model's accuracy on the original set (y-axis) is plotted against the total unlearning execution time in seconds (x-axis). Accuracy on the original sets remains about the same (with variation across runs), while execution time decreases as a larger proportion of points is removed before unlearning. Note that Bottom 60% in the left graph and Full Retain in the right graph are the same point, as we continue to remove points from the retain set after removing from the forget set.
  • Figure 5.2: Proportions of low influence points are removed from the CIFAR-100 forget and retain sets before running the Rank 1 (left) and Rank 2 (right) algorithms. We compare only removing points from the forget set (Full retain and x% forgetting) to removing points from both sets simultaneously. We track performance on both the original forget and retain sets to ensure maintained performance in both.
  • Figure B.1: We remove 40 points from the Food-101 training set that have at least $k=1$ neighbors with a cosine similarity of $\geq c$ (x-axis), and retrain a ViT model which we then test on the removed points (y-axis). We compare this to randomly removing 40 points with varying cosine similarity neighbors (Random) and testing accuracy on these points with a retrained model. As can be seen, the accuracy of the retrained model increases for removed points with higher cosine similarity with remaining neighbors.
  • ...and 8 more figures

Theorems & Definitions (4)

  • Theorem 3.1
  • Theorem 3.2
  • proof : Proof of Theorem 3.1
  • proof : Proof of Theorem 3.2