Unlearning Evaluation through Subset Statistical Independence

Chenhao Zhang, Muxing Li, Feng Liu, Weitong Chen, Miao Xu

TL;DR

This work designs a tailored use of the Hilbert-Schmidt Independence Criterion to assess whether the model outputs on a given subset exhibit statistical dependence, without requiring model retraining or auxiliary classifiers.

Abstract

Evaluating machine unlearning remains challenging, as existing methods typically require retraining reference models or performing membership inference attacks, both of which rely on prior access to training configuration or supervision labels, making them impractical in realistic scenarios. Motivated by the fact that most unlearning algorithms remove a small, random subset of the training data, we propose a subset-level evaluation framework based on statistical independence. Specifically, we design a tailored use of the Hilbert-Schmidt Independence Criterion to assess whether the model outputs on a given subset exhibit statistical dependence, without requiring model retraining or auxiliary classifiers. Our method provides a simple, standalone evaluation procedure that aligns with unlearning workflows. Extensive experiments demonstrate that our approach reliably distinguishes in-training from out-of-training subsets and clearly differentiates unlearning effectiveness, even when existing evaluations fall short.
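
For readers unfamiliar with the Hilbert-Schmidt Independence Criterion, the following is a minimal sketch of the standard biased empirical HSIC estimator with Gaussian kernels, written with NumPy. It is not the paper's evaluation procedure: how the statistic is computed from the outputs of a model on a subset is not described in this summary, so the inputs `x` and `y` are placeholders for any two sets of paired samples, and the function names `gaussian_gram` and `hsic_biased` are hypothetical. The estimate is normalised by $n^2$ here; some references use $(n-1)^2$ instead.

```python
import numpy as np


def gaussian_gram(z, sigma):
    """Gaussian (RBF) Gram matrix for the rows of z, with bandwidth sigma."""
    sq = np.sum(z ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives caused by round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (z @ z.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def hsic_biased(x, y, sigma_x, sigma_y):
    """Biased empirical HSIC, trace(K H L H) / n^2, for paired samples x, y.

    x and y are placeholder inputs of shape (n, d_x) and (n, d_y);
    they are an assumption of this sketch, not the paper's definition.
    """
    n = x.shape[0]
    K = gaussian_gram(x, sigma_x)
    L = gaussian_gram(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 3))
    y_dependent = x + 0.1 * rng.normal(size=(200, 3))
    y_independent = rng.normal(size=(200, 3))
    sigma = np.sqrt(3.0)
    print("dependent:  ", hsic_biased(x, y_dependent, sigma, sigma))
    print("independent:", hsic_biased(x, y_independent, sigma, sigma))
```

A larger value indicates stronger dependence between the two samples, so the dependent pair should score clearly higher than the independent pair. Turning the statistic into a decision requires a threshold or a significance test, which this sketch does not cover.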

Paper Structure (48 sections, 34 equations, 7 figures, 10 tables, 3 algorithms)


Figures (7)

  • Figure 1: Empirical $H(S, h)$ distributions calculated on in-training subset $\mathcal{S}_{\rm IT}$ and out-of-training subset $\mathcal{S}_{\rm OOT}$ using two models: (left) trained model $h^{\rm or}$ and (right) randomly initialized model $h^{\rm rand}$.
  • Figure 2: F1 score of SDE across training progress. The performance of retrained models at checkpoints is in Appendix \ref{sec:training_details}.
  • Figure 3: Effect of kernel bandwidth $\sigma$ on F1 score. The solid lines show F1 score trends across different $\sigma$ values. Vertical dashed lines indicate the heuristic of $\sigma=\sqrt{\rm dim}$, while dotted lines correspond to the median heuristic.
  • Figure 4: F1 score on activations of different intermediate layers. The x-axis labels from left to right ($h$ to $h_1$) represent layers from output to input. The reason for the missing bar (e.g., the green bar at $h_2$) is further discussed in Section \ref{sec:discuss}.
  • Figure 5: $H(S,h)$ distribution changes along the training epoch. The dashed gray line is the $p=0.01$ baseline.
  • ...and 2 more figures
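
The caption of Figure 3 names two ways of choosing the Gaussian kernel bandwidth $\sigma$: the $\sigma=\sqrt{\rm dim}$ heuristic and the median heuristic. The sketch below shows one common reading of each, assuming the samples are the rows of a NumPy array. The function names are hypothetical, and the median heuristic has several variants (for example, the median of squared distances); which variant the paper uses is not stated in this summary.

```python
import numpy as np


def sigma_sqrt_dim(z):
    """Bandwidth set to the square root of the feature dimension."""
    return float(np.sqrt(z.shape[1]))


def sigma_median(z):
    """Bandwidth set to the median pairwise Euclidean distance.

    Assumed variant: median over distinct pairs of rows of z.
    """
    sq = np.sum(z ** 2, axis=1)
    # Squared Euclidean distances; clip tiny negatives caused by round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (z @ z.T), 0.0)
    upper = np.triu_indices(z.shape[0], k=1)  # each distinct pair once
    return float(np.sqrt(np.median(d2[upper])))


if __name__ == "__main__":
    z = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
    print(sigma_sqrt_dim(z))  # square root of 2 features
    print(sigma_median(z))    # pairwise distances are 5, 10, 5
```

The $\sqrt{\rm dim}$ rule depends only on the feature dimension, while the median rule adapts to the scale of the data, so the two can differ widely when features are not normalised.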