Leveraging Per-Instance Privacy for Machine Unlearning

Nazanin Mohammadi Sepahvand, Anvith Thudi, Berivan Isik, Ashmita Bhattacharyya, Nicolas Papernot, Eleni Triantafillou, Daniel M. Roy, Gintare Karolina Dziugaite

TL;DR

The paper develops per-instance Rényi privacy losses, based on the Rényi divergence $D_{\alpha}$, to quantify how difficult it is to unlearn individual datapoints in neural networks without retraining from scratch. By combining Langevin unlearning theory with per-instance privacy analysis, it derives data-dependent bounds on the number of unlearning steps $k$ needed to forget a datapoint $x$: $k$ scales with the per-instance privacy loss $P(x,4\alpha)$, while an irreducible term reflects the distance to stationarity. Empirically, the authors show that per-instance privacy losses predict unlearning difficulty under both SGLD and standard fine-tuning, and that these losses correlate with traditional data-difficulty proxies while identifying harder-to-unlearn forget sets. They also show that datapoints with higher privacy losses correspond to larger loss barriers along linear paths in weight space, which gives a geometric interpretation of unlearning difficulty. Overall, the work lays a foundation for adaptive, data-aware unlearning strategies and suggests using per-instance privacy losses in practical unlearning pipelines and as proxy metrics.
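To make the "loss barrier along a linear path in weight space" concrete, here is a minimal sketch. It assumes the common linear-interpolation definition (the largest gap between the loss on the path and the straight line joining the endpoint losses); the paper's exact definition may differ. The names `interpolate`, `loss_barrier`, and `toy_loss` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence


def interpolate(w0: Sequence[float], w1: Sequence[float], t: float) -> List[float]:
    """Point at fraction t on the straight line from w0 to w1."""
    return [(1.0 - t) * a + t * b for a, b in zip(w0, w1)]


def loss_barrier(
    loss: Callable[[Sequence[float]], float],
    w0: Sequence[float],
    w1: Sequence[float],
    num_points: int = 11,
) -> float:
    """Largest excess of the path loss over the linear baseline.

    Assumed definition (not taken from the paper):
        max_t  loss((1-t) w0 + t w1) - [(1-t) loss(w0) + t loss(w1)]
    evaluated on an evenly spaced grid of t in [0, 1].
    """
    l0, l1 = loss(w0), loss(w1)
    barrier = 0.0
    for i in range(num_points):
        t = i / (num_points - 1)
        gap = loss(interpolate(w0, w1, t)) - ((1.0 - t) * l0 + t * l1)
        barrier = max(barrier, gap)
    return barrier


def toy_loss(w: Sequence[float]) -> float:
    """Toy double-well loss with minima at w[0] = -1 and w[0] = +1."""
    return (w[0] ** 2 - 1.0) ** 2


# Two minima separated by a bump of height 1 at w[0] = 0.
barrier_between_minima = loss_barrier(toy_loss, [-1.0], [1.0])
```

In an unlearning experiment the two endpoints would be the weights of the unlearned model and of an oracle retrained without the forget set, and `loss` would be evaluated on held-out or forget-set data.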

Abstract

We present a principled, per-instance approach to quantifying the difficulty of unlearning via fine-tuning. We begin by sharpening an analysis of noisy gradient descent for unlearning (Chien et al., 2024), obtaining a better utility-unlearning tradeoff by replacing worst-case privacy loss bounds with per-instance privacy losses (Thudi et al., 2024), each of which bounds the (Rényi) divergence to retraining without an individual data point. To demonstrate the practical applicability of our theory, we present empirical results showing that our theoretical predictions are borne out both for Stochastic Gradient Langevin Dynamics (SGLD) and for standard fine-tuning without explicit noise. We further demonstrate that per-instance privacy losses correlate well with several existing data difficulty metrics, while also identifying harder groups of data points, and introduce novel evaluation methods based on loss barriers. Altogether, our findings provide a foundation for more efficient and adaptive unlearning strategies tailored to the unique properties of individual data points.
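As a rough illustration of unlearning by noisy gradient descent, the sketch below fine-tunes a one-parameter model on the retain set only, adding Gaussian noise to each step. Everything here is an assumption made for illustration: the squared-error mean-estimation task, the noise scale $\sqrt{2\eta}\,\sigma$, the omission of any projection step, and the hypothetical name `noisy_gd_unlearn`. It is not the authors' implementation.

```python
import math
import random
from typing import Sequence


def noisy_gd_unlearn(
    theta: float,
    retain: Sequence[float],
    steps: int,
    lr: float = 0.5,
    sigma: float = 0.0,
    seed: int = 0,
) -> float:
    """Noisy gradient descent on the retain set, started from trained weights.

    Toy loss: 0.5 * mean((theta - x)^2) over the retain set, whose gradient
    is theta - mean(retain). With sigma = 0 this is plain fine-tuning.
    """
    rng = random.Random(seed)
    retain_mean = sum(retain) / len(retain)
    for _ in range(steps):
        grad = theta - retain_mean
        # Assumed Langevin-style noise scale; zero when sigma == 0.
        noise = math.sqrt(2.0 * lr) * sigma * rng.gauss(0.0, 1.0)
        theta = theta - lr * grad + noise
    return theta


data = [0.0, 1.0, 2.0, 9.0]
forget_index = 3  # the outlier 9.0 is the point to forget

# "Training" on all data gives the full-data mean (3.0).
theta_trained = sum(data) / len(data)

# Unlearning fine-tunes on the retain set only.
retain = [x for i, x in enumerate(data) if i != forget_index]
theta_unlearned = noisy_gd_unlearn(theta_trained, retain, steps=50)
```

With `sigma=0.0` the iterate contracts toward the retain-set mean (1.0), which is what retraining without the forgotten point would produce in this toy problem. Setting `sigma > 0` gives the noisy variant, where the output is a random variable and closeness to retraining is measured in distribution rather than pointwise.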

Paper Structure

This paper contains 49 sections, 4 theorems, 14 equations, 9 figures.

Key Result

Corollary 4.1

Fix $\mathcal{D}$, $\mathcal{D}'$. Let $\{\varepsilon_\alpha,\varepsilon'_\alpha\}_{\alpha \ge 1}$ satisfy $\max\{D_{\alpha}(\nu_{\mathcal{D}'}\|\nu_{T,\mathcal{D}'}),D_{\alpha}(\nu_{T,\mathcal{D}'}\|\nu_{\mathcal{D}'})\}\leq \varepsilon_\alpha$ and $D_{\alpha}(\nu_{T,\mathcal{D}} \| \nu_{T,\mathcal
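For reference, $D_{\alpha}$ in the statement above denotes the Rényi divergence of order $\alpha$. The standard definition for $\alpha > 1$, together with its closed form for two Gaussians sharing an isotropic covariance, is recalled below. The symbols $P$, $Q$, $p$, $q$, $\mu_1$, $\mu_2$, and $\sigma$ are generic notation introduced here, not taken from the paper.

```latex
% Rényi divergence of order \alpha > 1 between distributions P and Q
% with densities p and q:
D_{\alpha}(P \,\|\, Q)
  = \frac{1}{\alpha - 1}
    \log \mathbb{E}_{x \sim Q}\!\left[ \left( \frac{p(x)}{q(x)} \right)^{\alpha} \right]

% Closed form for Gaussians with a shared isotropic covariance:
D_{\alpha}\!\left( \mathcal{N}(\mu_1, \sigma^2 I) \,\|\, \mathcal{N}(\mu_2, \sigma^2 I) \right)
  = \frac{\alpha \, \lVert \mu_1 - \mu_2 \rVert^2}{2 \sigma^2}
```

The Gaussian case shows why noise helps: for a fixed gap between the two means, a larger $\sigma$ yields a smaller divergence.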

Figures (9)

  • Figure 1: CIFAR-10 dataset results. Left: SGLD unlearning with varying levels of noise ($\sigma$). Forget set difficulty (x-axis), as measured by the privacy loss, against time to unlearn (y-axis). Time to unlearn is measured in terms of epochs needed to get within 5% of the unlearning metric (e.g., UA or MIA) measured on the oracle model. Middle: SGD unlearning. Time to unlearn measured across three evaluation metrics. Right: Error barrier between the oracle and the unlearned model before and after unlearning for forget sets with different privacy losses. Baseline corresponds to the loss barrier between two oracles.
  • Figure 2: Correlation between privacy losses (x-axis) and various proxy metrics (y-axis). The values of all proxy metrics are normalized to their maximum value for better visual clarity. For improved readability, the data is binned into 30 bins.
  • Figure 3: Comparison of the time needed to unlearn (y-axis) the most difficult forget sets as identified by privacy losses (ours), C-Proxy, average gradient norm and EL2N, across different forget set sizes (x-axis).
  • Figure 4: We compared our estimates of the group privacy guarantees (y-axis) across forget sets determined by rankings of privacy losses (x-axis), and found the group privacy guarantees did not change. This was despite these forget sets leading to consistent differences in the number of steps to unlearn. We report the mean over $20$ estimates of the group privacy values, and one standard deviation. We conclude the theory for group unlearning is currently not sharp enough to capture trends seen in practice.
  • Figure 5: Unlearning results for accuracy metrics (top) and MIA success rate (bottom). The x-axis represents the number of epochs. In each plot, lines of different colors represent forget sets of varying difficulty, while the dashed line indicates the oracle's performance.
  • ...and 4 more figures
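The "time to unlearn" used in Figure 1 (epochs needed to get within 5% of the oracle's metric) can be computed along these lines. This sketch assumes the 5% is a relative tolerance and that the first epoch inside the band counts, even if the metric later leaves it; the caption does not settle either point. The function name and the example numbers are hypothetical.

```python
from typing import Optional, Sequence


def time_to_unlearn(
    metric_per_epoch: Sequence[float],
    oracle_value: float,
    tolerance: float = 0.05,
) -> Optional[int]:
    """First epoch whose metric is within `tolerance` (relative) of the oracle.

    metric_per_epoch[0] is the value before any unlearning epoch.
    Returns None if the metric never enters the tolerance band.
    """
    for epoch, value in enumerate(metric_per_epoch):
        if abs(value - oracle_value) <= tolerance * abs(oracle_value):
            return epoch
    return None


# Made-up forget-set accuracy trace, drifting toward an oracle accuracy of 0.85.
forget_accuracy = [0.99, 0.95, 0.90, 0.86, 0.84]
epochs_needed = time_to_unlearn(forget_accuracy, oracle_value=0.85)
```

Here the band is 0.85 ± 0.0425, so the trace first enters it at epoch 3. The same function applies to other metrics such as MIA success rate, with the oracle's value for that metric as the target.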

Theorems & Definitions (10)

  • Definition 3.1
  • Definition 3.2: Rényi Unlearning
  • Corollary 4.1
  • Definition 4.2: Per-Instance Privacy Loss
  • Theorem 4.3
  • proof
  • Corollary 4.4
  • proof
  • Theorem 2.1
  • proof