Towards Certified Unlearning for Deep Neural Networks

Binchi Zhang; Yushun Dong; Tianhao Wang; Jundong Li

TL;DR

This work extends certified unlearning to deep neural networks, addressing three obstacles: nonconvexity, nonconvergence, and sequential unlearning requests. It introduces two key ideas: (i) a modified Newton update that uses a local convex approximation and a norm constraint to bound the inverse Hessian, and (ii) an efficient inverse-Hessian estimator based on LiSSA that makes the update toward the retrained model tractable. The unlearned model is formed as $\mathcal{F}(\bm{w}^*,\mathcal{D}_u,\mathcal{D})=\bm{w}^*+\frac{n_u}{(n-n_u)H}\tilde{\bm{H}}^{-1}_{s,\lambda}\nabla\mathcal{L}(\bm{w}^*,\mathcal{D}_u)$, which yields $\varepsilon-\delta$ certified unlearning once Gaussian noise with standard deviation $\sigma \ge (\Delta/\varepsilon)\sqrt{2\ln(1.25/\delta)}$ is added, where $\Delta$ bounds the approximation error with respect to the retrained model. The paper gives theoretical bounds and practical guidance for the nonconvergent and sequential settings, and its experiments on three real-world datasets show strong unlearning efficacy and substantial efficiency gains over retraining. Overall, the work offers a principled, scalable path to certifiable forgetting in nonconvex neural models.
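The noise calibration mentioned above can be sketched in a few lines. This is a minimal illustration of the standard Gaussian-mechanism formula $\sigma = (\Delta/\varepsilon)\sqrt{2\ln(1.25/\delta)}$, not the authors' implementation; the function names `gaussian_sigma` and `add_certification_noise`, the argument `delta_bound` (standing in for $\Delta$), and the flat list of weights are all assumptions made for the example.

```python
import math
import random


def gaussian_sigma(delta_bound: float, eps: float, delta: float) -> float:
    """Smallest noise scale allowed by sigma >= (Delta / eps) * sqrt(2 ln(1.25 / delta)).

    delta_bound is a hypothetical name for Delta, the approximation error bound.
    """
    if delta_bound < 0 or eps <= 0 or not (0 < delta < 1):
        raise ValueError("need delta_bound >= 0, eps > 0, and 0 < delta < 1")
    return (delta_bound / eps) * math.sqrt(2.0 * math.log(1.25 / delta))


def add_certification_noise(weights, delta_bound, eps, delta, seed=None):
    """Add i.i.d. N(0, sigma^2) noise to each weight (weights is a flat list of floats)."""
    rng = random.Random(seed)
    sigma = gaussian_sigma(delta_bound, eps, delta)
    return [w + rng.gauss(0.0, sigma) for w in weights]


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    sigma = gaussian_sigma(delta_bound=0.1, eps=1.0, delta=1e-5)
    print(f"sigma = {sigma:.4f}")
    print(add_certification_noise([0.5, -0.2, 1.3], 0.1, 1.0, 1e-5, seed=0))
```

The sketch makes the trade-off visible: a tighter error bound $\Delta$ or a looser budget ($\varepsilon$ larger, $\delta$ larger) lowers the required noise, which is why bounding the approximation error matters for model utility.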

Abstract

In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points. Extensive experiments on three real-world datasets demonstrate the efficacy of our method and the advantages of certified unlearning in DNNs.
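To make the inverse Hessian approximation more concrete, here is a minimal sketch of a LiSSA-style recursion for an inverse-Hessian-vector product. It is a simplified, deterministic illustration under stated assumptions: it uses an explicit small Hessian matrix instead of stochastic mini-batch Hessian-vector products, it omits the norm constraint on the weights, and the names `lissa_inverse_hvp`, `newton_style_update`, `lam`, and `scale` are hypothetical. The damping term `lam` plays the role of the local convex coefficient $\lambda$, and the final division by `scale` appears to correspond to the $1/H$ factor in the TL;DR formula.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, x: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def lissa_inverse_hvp(hessian: Matrix, v: Vector, lam: float,
                      scale: float, steps: int) -> Vector:
    """Approximate (hessian + lam * I)^{-1} v with a truncated Neumann series.

    Recursion: est <- v + (I - (hessian + lam * I) / scale) est.
    It converges when every eigenvalue of hessian + lam * I lies in (0, 2 * scale).
    """
    est = list(v)
    for _ in range(steps):
        hv = matvec(hessian, est)
        est = [vi + ei - (hvi + lam * ei) / scale
               for vi, ei, hvi in zip(v, est, hv)]
    # The fixed point is scale * (hessian + lam * I)^{-1} v, so undo the scaling.
    return [e / scale for e in est]


def newton_style_update(w_star: Vector, grad_forget: Vector, hessian: Matrix,
                        n: int, n_u: int, lam: float, scale: float,
                        steps: int) -> Vector:
    """w* + n_u / (n - n_u) * (hessian + lam * I)^{-1} grad_forget."""
    direction = lissa_inverse_hvp(hessian, grad_forget, lam, scale, steps)
    coeff = n_u / (n - n_u)
    return [wi + coeff * di for wi, di in zip(w_star, direction)]


if __name__ == "__main__":
    # Toy indefinite Hessian (one negative eigenvalue); lam = 1.0 makes it positive definite.
    hessian = [[2.0, 0.0], [0.0, -0.5]]
    print(lissa_inverse_hvp(hessian, [3.0, 1.0], lam=1.0, scale=4.0, steps=300))
```

In the toy example the undamped Hessian has a negative eigenvalue, so a plain Newton step would be ill-defined in that direction; adding `lam` restores positive definiteness, which is the intuition behind the local convex approximation. For a real network the explicit matrix would be replaced by Hessian-vector products computed with automatic differentiation.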

Paper Structure (32 sections, 9 theorems, 40 equations, 4 figures, 6 tables, 2 algorithms)

Introduction
Certified Unlearning
Methodology
Practical Consideration
Experiments
Dataset Information
Baseline Information
Implementation
Unlearning Performance
Efficiency
Ablation Study
Sequential Unlearning
Parameter Study
Related Works
Exact Unlearning
...and 17 more sections

Key Result

Proposition 2.2

If a learning algorithm $\mathcal{A}$ provides $\varepsilon-\delta$ differential privacy, then $\mathcal{A}(\mathcal{D})$ is an $\varepsilon-\delta$ certified unlearned model.
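For context, the condition behind this proposition can be written out explicitly. Definition 2.1 is not reproduced in this summary, so the block below uses the form of $\varepsilon-\delta$ certified unlearning that is common in the certified unlearning literature; the symbols $\mathcal{U}$ (unlearning algorithm) and $\mathcal{T}$ (a measurable set of models) are assumptions about notation rather than quotations from the paper.

```latex
% Commonly used form of (epsilon, delta) certified unlearning; notation is assumed.
% U is the unlearning algorithm, A the learning algorithm, D_u the data to forget.
\begin{align*}
\Pr\big(\mathcal{U}(\mathcal{D},\mathcal{D}_u,\mathcal{A}(\mathcal{D}))\in\mathcal{T}\big)
  &\le e^{\varepsilon}\,\Pr\big(\mathcal{A}(\mathcal{D}\setminus\mathcal{D}_u)\in\mathcal{T}\big)+\delta,\\
\Pr\big(\mathcal{A}(\mathcal{D}\setminus\mathcal{D}_u)\in\mathcal{T}\big)
  &\le e^{\varepsilon}\,\Pr\big(\mathcal{U}(\mathcal{D},\mathcal{D}_u,\mathcal{A}(\mathcal{D}))\in\mathcal{T}\big)+\delta.
\end{align*}
```

Under this reading, when $\mathcal{D}_u$ is a single sample, $\mathcal{D}$ and $\mathcal{D}\setminus\mathcal{D}_u$ are neighboring datasets, so the differential privacy inequalities for $\mathcal{A}$ are exactly the two conditions above with $\mathcal{U}$ returning the original model unchanged.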

Figures (4)

Figure 1: Illustration of certified unlearning, where the first step is to estimate the retrained model from the original model, and the second step is to add noise to the estimate. According to Theorem 2.3, we can guarantee that the difference in distributions between the unlearned model and the retrained model is bounded by the certification budgets.
Figure 2: Comparison of unlearning time between the certified unlearning method and unlearning baselines over three popular DNNs across three datasets.
Figure 3: Gradient norm, approximation error bound, and model utility after each unlearning step.
Figure 4: The effect of the local convex coefficient $\lambda$ and the certification budgets $\varepsilon$ and $\delta$ with the MLP backbone on MNIST.

Theorems & Definitions (19)

Definition 2.1
Proposition 2.2
Theorem 2.3
Lemma 3.3
Theorem 3.4
Proposition 3.5
Theorem 3.6
Proposition 4.1
Proposition 4.2
proof
...and 9 more
