The Utility and Complexity of in- and out-of-Distribution Machine Unlearning
Youssef Allouah, Joshua Kazdan, Rachid Guerraoui, Sanmi Koyejo
TL;DR
This work formalizes approximate machine unlearning via $(q,\varepsilon)$-Rényi guarantees and distinguishes in-distribution from out-of-distribution forget data. For in-distribution forget data, it proves that a simple noisy ERM procedure with output perturbation achieves dimension-free deletion capacity, with near-linear time and space complexity and a sharp separation from differential privacy. For out-of-distribution forget data, it introduces a robust gradient method based on coordinate-wise trimmed means that amortizes unlearning time, with bounds driven by the interpolation error; under strong conditions it achieves near-linear time and constant-factor deletion capacity. Together, these results provide theoretical guarantees for practical unlearning under privacy and robustness constraints and identify directions for future work, including unified upper bounds and extensions to richer models.
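To make the aggregation step concrete, here is a minimal sketch of a coordinate-wise trimmed mean over a batch of per-sample gradients. The function name `trimmed_mean`, the `trim` parameter, and the toy gradients are illustrative assumptions; the paper's actual algorithm also adds noise and chooses the trimming level from its analysis, neither of which is shown here.

```python
from typing import List, Sequence


def trimmed_mean(vectors: Sequence[Sequence[float]], trim: int) -> List[float]:
    """Coordinate-wise trimmed mean (illustrative helper, not the paper's code).

    For each coordinate, sort the values, drop the `trim` smallest and the
    `trim` largest, and average what remains.
    """
    n = len(vectors)
    if n == 0:
        raise ValueError("need at least one vector")
    if trim < 0 or 2 * trim >= n:
        raise ValueError("trim must satisfy 0 <= 2 * trim < len(vectors)")
    dim = len(vectors[0])
    result = []
    for j in range(dim):
        column = sorted(v[j] for v in vectors)
        kept = column[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


# Toy per-sample gradients: three similar ones and one outlier, standing in
# for an out-of-distribution sample (values are made up for illustration).
grads = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [100.0, -50.0]]
robust = trimmed_mean(grads, trim=1)
plain = [sum(g[j] for g in grads) / len(grads) for j in range(2)]
```

In this toy batch the plain average is pulled far from the three similar gradients by the single outlier, while the trimmed mean stays close to them. That insensitivity to a few extreme samples is the property a robust gradient method relies on.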
Abstract
Machine unlearning, the process of selectively removing data from trained models, is increasingly crucial for addressing privacy concerns and knowledge gaps post-deployment. Despite this importance, existing approaches are often heuristic and lack formal guarantees. In this paper, we analyze the fundamental utility, time, and space complexity trade-offs of approximate unlearning, providing rigorous certification analogous to differential privacy. For in-distribution forget data -- data similar to the retain set -- we show that a surprisingly simple and general procedure, empirical risk minimization with output perturbation, achieves tight unlearning-utility-complexity trade-offs, addressing a previous theoretical gap on the separation from unlearning "for free" via differential privacy, which inherently facilitates the removal of such data. However, such techniques fail with out-of-distribution forget data -- data significantly different from the retain set -- where unlearning time complexity can exceed that of retraining, even for a single sample. To address this, we propose a new robust and noisy gradient descent variant that provably amortizes unlearning time complexity without compromising utility.
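As a rough illustration of empirical risk minimization with output perturbation, the sketch below fits a ridge-regularized least-squares model by gradient descent and then adds Gaussian noise to the parameters. Unlearning is modeled as re-optimizing on the retain set, warm-started from the trained model, followed by the same perturbation. All names (`erm_ridge`, `perturb`, `unlearn`), the loss, the toy data, and the values of `lam` and `sigma` are assumptions made for this example; in particular, the noise scale needed for a given $(q,\varepsilon)$ guarantee comes from the paper's analysis and is not derived here.

```python
import random
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (features, label)


def erm_ridge(data: Sequence[Sample], lam: float, steps: int = 500,
              lr: float = 0.1, init: Optional[Sequence[float]] = None) -> List[float]:
    """Gradient descent on (1/n) * sum 0.5 * (w.x - y)^2 + (lam / 2) * ||w||^2."""
    dim = len(data[0][0])
    w = list(init) if init is not None else [0.0] * dim
    n = len(data)
    for _ in range(steps):
        grad = [lam * wj for wj in w]  # gradient of the regularizer
        for x, y in data:
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j in range(dim):
                grad[j] += err * x[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def perturb(w: Sequence[float], sigma: float, rng: random.Random) -> List[float]:
    """Output perturbation: add i.i.d. Gaussian noise to each parameter."""
    return [wj + rng.gauss(0.0, sigma) for wj in w]


def unlearn(w_trained: Sequence[float], retain: Sequence[Sample], lam: float,
            sigma: float, rng: random.Random) -> List[float]:
    """Re-optimize on the retain set from the trained model, then add noise."""
    w_retain = erm_ridge(retain, lam, init=w_trained)
    return perturb(w_retain, sigma, rng)


# Toy 1-D data with y = 2x; the last sample plays the role of the forget set.
full = [([1.0], 2.0), ([-1.0], -2.0), ([0.5], 1.0), ([2.0], 4.0)]
retain = full[:-1]
lam = 0.1  # hypothetical regularization strength

w_trained = erm_ridge(full, lam)
w_noiseless = unlearn(w_trained, retain, lam, sigma=0.0, rng=random.Random(0))
w_unlearned = unlearn(w_trained, retain, lam, sigma=0.05, rng=random.Random(0))
```

With `sigma=0.0` the procedure reduces to exact minimization on the retain set, which is a convenient sanity check. The added noise is what lets the unlearned model be compared, in distribution, with a model trained without the forget data.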
