Hessian-Free Online Certified Unlearning

Xinbao Qiao, Meng Zhang, Ming Tang, Ermin Wei

TL;DR

This work introduces Hessian-Free Online Certified Unlearning (HF-OCU), a method for forgetting data from trained models with certified guarantees and without performing Hessian inversions. HF-OCU maintains per-sample impact statistics computed through an affine stochastic recursion and Hessian-vector products, so online data removal reduces to simple vector additions and is near-instantaneous, while the unlearning and generalization guarantees match or improve on those of Hessian-based baselines. Theoretical bounds show improved unlearning error and generalization behavior in non-convex settings. Experiments demonstrate millisecond-level unlearning, reduced precomputation and storage costs, and robust performance across convex and non-convex models; open-source code is provided. The paper also analyzes privacy implications under membership inference attacks (MIA-L/MIA-U) and shows that calibrated noise can defend against membership inference while maintaining utility, which matters in practice for supporting the right to be forgotten in modern ML systems.

Abstract

Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data. Recent advances suggest pre-computing and storing statistics extracted from second-order information and implementing unlearning through Newton-style updates. However, Hessian matrix operations are extremely costly, and previous works perform unlearning for the empirical risk minimizer under a convexity assumption, which precludes their applicability to high-dimensional over-parameterized models and to the non-convergence condition. In this paper, we propose an efficient Hessian-free unlearning approach. The key idea is to maintain a statistical vector for each training data point, computed through an affine stochastic recursion on the difference between the retrained and learned models. We prove that our proposed method outperforms the state-of-the-art methods in terms of the unlearning and generalization guarantees, the deletion capacity, and the time/storage complexity, under the same regularity conditions. Through the strategy of recollecting statistics for removing data, we develop an online unlearning algorithm that achieves near-instantaneous data removal, as it requires only vector addition. Experiments demonstrate that our proposed scheme surpasses existing results by orders of magnitude in terms of time/storage costs, with millisecond-level unlearning execution, while also enhancing test accuracy.
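To make the idea of a per-sample statistical vector concrete, the following is a minimal sketch and not the authors' implementation. It assumes full-batch gradient descent on a ridge-regression loss, a step scaled by 1/n in both training and retraining, and a shared initialization. All function names, hyperparameters, and the synthetic data are hypothetical. The vector `a_u` is updated alongside training using only Hessian-vector products, and unlearning sample `u` is then the single addition `w + a_u`. Because the loss here is quadratic, the recursion reproduces the retrained model up to floating-point error; for general losses it would only be an approximation.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def grad(w, x, y, lam):
    """Gradient of the per-sample loss 0.5*(x.w - y)^2 + 0.5*lam*|w|^2."""
    r = dot(x, w) - y
    return [r * xi + lam * wi for xi, wi in zip(x, w)]

def hvp(v, x, lam):
    """Hessian-vector product (x x^T + lam*I) v without forming the matrix."""
    s = dot(x, v)
    return [s * xi + lam * vi for xi, vi in zip(x, v)]

def train(X, Y, skip=None, track=None, steps=200, lr=0.1, lam=0.01):
    """Full-batch gradient descent from w = 0 (assumed setup).

    skip:  index left out of training, used to build the retrained reference.
    track: index whose vector a is maintained by the affine recursion
           a <- a - (lr/n) * sum_{i != track} H_i a + (lr/n) * grad_track(w).
    """
    n, d = len(X), len(X[0])
    w, a = [0.0] * d, [0.0] * d
    for _ in range(steps):
        g_sum = [0.0] * d
        h_sum = [0.0] * d
        for i in range(n):
            if i == skip:
                continue
            g_sum = [s + g for s, g in zip(g_sum, grad(w, X[i], Y[i], lam))]
            if track is not None and i != track:
                h_sum = [s + h for s, h in zip(h_sum, hvp(a, X[i], lam))]
        if track is not None:
            # Uses the pre-update weights w, matching the recursion above.
            g_u = grad(w, X[track], Y[track], lam)
            a = [ai + lr / n * (gi - hi) for ai, gi, hi in zip(a, g_u, h_sum)]
        w = [wi - lr / n * gi for wi, gi in zip(w, g_sum)]
    return w, a

# Hypothetical synthetic regression data.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(20)]
w_true = [1.0, -2.0, 0.5]
Y = [dot(x, w_true) + 0.1 * rng.gauss(0, 1) for x in X]

u = 7                                    # sample to forget
w, a_u = train(X, Y, track=u)            # learn and record the statistic
w_retrained, _ = train(X, Y, skip=u)     # reference: retrain without u
w_unlearned = [wi + ai for wi, ai in zip(w, a_u)]  # unlearning = one addition

err = max(abs(p - q) for p, q in zip(w_unlearned, w_retrained))
gap = max(abs(p - q) for p, q in zip(w, w_retrained))
print(f"distance to retrained model: before {gap:.2e}, after {err:.2e}")
```

The cost of the sketch is paid during training: one extra Hessian-vector product per sample per step. The deletion itself touches no training data. A certified variant would additionally perturb `w_unlearned` with calibrated noise, which is omitted here.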

Paper Structure

This paper contains 37 sections, 10 theorems, 77 equations, 12 figures, 8 tables, 3 algorithms.

Key Result

Theorem 1

When $m$ sequential deletion requests arrive, the sum of the $m$ approximators is equivalent to performing the batch deletion simultaneously. Colloquially, for a sequence $u_1, \ldots, u_m$ of continuously arriving deletion requests, we demonstrate that $\mathbf{a}_{E,B}^{-\sum_{j=1}^m u_j}=\sum_{j=1}^m \math
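The additivity property can be checked numerically with a small sketch. This is an illustration under stated assumptions, not the paper's algorithm: it uses full-batch gradient descent on regularized logistic regression, and it assumes that every approximator is propagated by the same linear map built from Hessian-vector products at the learned weights, with only the gradient forcing term depending on the forgotten samples. Under that assumption the recursion is linear in the forcing term, so the approximator for a batch of requests equals the sum of the individual ones. All names, hyperparameters, and data below are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad(w, x, y, lam):
    """Gradient of logistic loss (label y in {0, 1}) plus 0.5*lam*|w|^2."""
    p = sigmoid(dot(x, w))
    return [(p - y) * xi + lam * wi for xi, wi in zip(x, w)]

def hvp(w, v, x, lam):
    """Hessian-vector product at w: p*(1-p)*x*(x.v) + lam*v, matrix-free."""
    p = sigmoid(dot(x, w))
    s = p * (1.0 - p) * dot(x, v)
    return [s * xi + lam * vi for xi, vi in zip(x, v)]

def approximators(X, Y, groups, steps=100, lr=0.5, lam=0.01):
    """Train on all data and track one approximator per group of indices.

    Assumed recursion, shared by every group:
        a <- a - (lr/n) * sum_i H_i(w) a + (lr/n) * sum_{u in group} grad_u(w)
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    A = [[0.0] * d for _ in groups]
    for _ in range(steps):
        grads = [grad(w, X[i], Y[i], lam) for i in range(n)]
        new_A = []
        for a, group in zip(A, groups):
            curv = [0.0] * d
            for i in range(n):
                curv = add(curv, hvp(w, a, X[i], lam))
            force = [0.0] * d
            for u in group:
                force = add(force, grads[u])
            new_A.append([ai + lr / n * (fi - ci)
                          for ai, fi, ci in zip(a, force, curv)])
        A = new_A
        total = [0.0] * d
        for g in grads:
            total = add(total, g)
        w = [wi - lr / n * ti for wi, ti in zip(w, total)]
    return w, A

# Hypothetical synthetic classification data.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(30)]
w_true = [1.5, -1.0, 0.5]
Y = [1 if dot(x, w_true) + 0.3 * rng.gauss(0, 1) > 0 else 0 for x in X]

requests = [3, 11, 17]                          # sequential deletion requests
groups = [[u] for u in requests] + [requests]   # singles, then the batch
w, A = approximators(X, Y, groups)
a_batch = A[-1]

summed = [0.0] * len(w)
for a in A[:-1]:
    summed = add(summed, a)

additivity_gap = max(abs(p - q) for p, q in zip(summed, a_batch))
print(f"max |sum of single approximators - batch approximator| = {additivity_gap:.2e}")
```

In an online setting this means each arriving request only needs its stored vector added to the current weights; nothing has to be recomputed for earlier requests.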

Figures (12)

  • Figure 1: Verification experiments I on (a) LR and (b) CNN. The left panels of (a) and (b) show the distance to the retrained model, i.e., $\|\mathbf{w}_{E,B}^{-U} - \mathbf{w}_{E,B} - \mathbf{a}^{-U}\|$. The right panels of (a) and (b) show the correlation between the approximate loss change and the actual loss change on the forgetting dataset.
  • Figure 2: Membership Inference Attack I. The first plot shows the MIA-L results, with error bars indicating 95% confidence intervals; the second and third plots depict the MIA-U results; and the fourth plot illustrates the privacy-utility tradeoff. When applied with appropriate noise, the proposed method successfully defends against MIA-L and mitigates excessive privacy leakage from MIA-U without sacrificing performance.
  • Figure 3: Unlearning-Repairing Strategy. The blue, red, and green lines respectively represent the learning, unlearning, and repairing process. When a significant amount of data is removed, leading to a loss in model performance, performing fine-tuning on the remaining dataset and recording the approximators during this period can help avoid performance degradation.
  • Figure 4: Verification experiments II. Comparison between the norm of the approximate parameter change $\|\mathbf{a}^{-U}\|$ and the norm of the exact parameter change $\|\mathbf{w}_{E,B}^{-U} - \mathbf{w}_{E,B}\|$ across different random seeds. Intuitively, the NS and IJ methods are contingent on the selection of forgetting data. In contrast, our approach consistently approximates the actual values effectively.
  • Figure 5: Membership Inference Attack II. The AUC values of MIA-U at different deletion rates. The feature construction method employed in MIA-U is DirectDiff, with error bars representing 95% confidence intervals. The target model in the first plot is LR, while the second plot corresponds to a CNN. The findings show that lower deletion rates lead to privacy leakage, whereas higher deletion rates diminish MIA-U's attack performance, bringing it closer to random guessing.
  • ...and 7 more figures

Theorems & Definitions (22)

  • Definition 1: $(\epsilon,\delta)$-certified unlearning
  • Theorem 1: Additivity
  • Lemma 2
  • Lemma 3
  • Theorem 4: Unlearning Guarantee
  • Corollary 5
  • Theorem 6: Generalization Guarantee
  • Definition 2: Deletion Capacity (Sekhari et al., 2021)
  • Theorem 7: Upper Bound of Deletion Capacity (Sekhari et al., 2021)
  • Theorem 8: Deletion Capacity Guarantee
  • ...and 12 more