Towards Certified Unlearning for Deep Neural Networks
Binchi Zhang, Yushun Dong, Tianhao Wang, Jundong Li
TL;DR
This work extends certified unlearning to deep neural networks, addressing three obstacles: nonconvexity, nonconvergence, and sequential unlearning requests. It introduces two key ideas: (i) a modified Newton update that uses a local convex approximation and a norm constraint to bound the inverse Hessian, and (ii) an efficient inverse-Hessian estimator based on LiSSA that makes the update toward the retrained model tractable. The unlearned model is formed as $\mathcal{F}(\bm{w}^*,\mathcal{D}_u,\mathcal{D})=\bm{w}^*+\frac{n_u}{(n-n_u)H}\tilde{\bm{H}}^{-1}_{s,\lambda}\nabla\mathcal{L}(\bm{w}^*,\mathcal{D}_u)$, which yields $(\varepsilon,\delta)$-certified unlearning when Gaussian noise with standard deviation $\sigma \ge (\Delta/\varepsilon)\sqrt{2\ln(1.25/\delta)}$ is added, where $\Delta$ is an upper bound on the approximation error of the update. The paper provides theoretical bounds and practical guidance for the nonconvergent and sequential settings, and validates the approach with extensive experiments showing strong unlearning efficacy and substantial efficiency gains over retraining. Overall, the work offers a principled, scalable path to certifiable forgetting in nonconvex neural models.
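The update and noise calibration described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, `ihvp` stands in for the estimated product $\tilde{\bm{H}}^{-1}_{s,\lambda}\nabla\mathcal{L}(\bm{w}^*,\mathcal{D}_u)$, and $n$ and $n_u$ are assumed to be the sizes of $\mathcal{D}$ and $\mathcal{D}_u$.

```python
import numpy as np

def gaussian_sigma(delta_bound, eps, delta):
    """Smallest noise standard deviation allowed by the bound
    sigma >= (Delta / eps) * sqrt(2 ln(1.25 / delta))."""
    return (delta_bound / eps) * np.sqrt(2.0 * np.log(1.25 / delta))

def unlearn(w_star, ihvp, n, n_u, H, delta_bound, eps, delta, rng):
    """Apply w* + n_u / ((n - n_u) H) * ihvp, then add isotropic Gaussian noise.

    ihvp is assumed to be precomputed (hypothetical input); H is the scaling
    factor that appears in the update formula.
    """
    w_tilde = w_star + n_u / ((n - n_u) * H) * ihvp
    sigma = gaussian_sigma(delta_bound, eps, delta)
    return w_tilde + rng.normal(0.0, sigma, size=w_star.shape)
```

Note that the required noise scales linearly with $\Delta$, so a tighter approximation bound translates directly into less noise on the released model.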
Abstract
In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to extend certified unlearning methods to nonconvex objectives. To reduce the time complexity, we develop an efficient computation method by inverse Hessian approximation without compromising certification guarantees. In addition, we extend our discussion of certification to nonconvergence training and sequential unlearning, considering that real-world users can send unlearning requests at different time points. Extensive experiments on three real-world datasets demonstrate the efficacy of our method and the advantages of certified unlearning in DNNs.
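As a rough illustration of inverse Hessian approximation, the sketch below runs a LiSSA-style truncated Neumann recursion on a damped Hessian. It is a simplified, deterministic version under stated assumptions: `hvp` is a hypothetical callable returning a Hessian-vector product, `lam` is a damping term that makes the operator positive definite, and `scale` is assumed to upper-bound the norm of the damped Hessian. The paper's estimator may differ in details such as minibatch sampling.

```python
import numpy as np

def lissa_ihvp(hvp, v, scale, lam, steps):
    """Approximate scale * (A + lam*I)^{-1} v without forming A.

    Recursion: h_0 = v,  h_j = v + h_{j-1} - (A h_{j-1} + lam h_{j-1}) / scale.
    Its fixed point satisfies (A + lam*I) h = scale * v, so dividing the
    result by `scale` gives the inverse-Hessian-vector product.
    """
    h = v.copy()
    for _ in range(steps):
        h = v + h - (hvp(h) + lam * h) / scale
    return h

# Toy usage with an explicit 2x2 Hessian (illustrative values only).
A = np.diag([1.0, 2.0])
v = np.array([1.0, 1.0])
h = lissa_ihvp(lambda x: A @ x, v, scale=4.0, lam=0.5, steps=200)
approx = h / 4.0  # estimate of (A + 0.5*I)^{-1} v
```

Each step costs one Hessian-vector product, which is what makes this kind of estimator far cheaper than inverting the full Hessian of a deep network.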
