Certified Data Removal from Machine Learning Models
Chuan Guo, Tom Goldstein, Awni Hannun, Laurens van der Maaten
TL;DR
The paper defines certified removal (CR), a guarantee inspired by differential privacy: a model from which data has been removed must be statistically indistinguishable from a model that never observed that data. It introduces a practical CR mechanism for $L_2$-regularized linear models that applies a one-step Newton update to attenuate the removed data's influence and uses loss perturbation to mask the gradient residual that the update leaves behind. The authors derive both worst-case and data-dependent bounds on the gradient residual and extend the approach to batch removals and online computation, demonstrating that removal can be orders of magnitude cheaper than retraining. Experiments on MNIST, LSUN, and SST, including settings with differentially private feature extractors, show that substantial numbers of removals are possible with acceptable accuracy loss, and that CR can be combined with public data or differentially private components to yield practical, scalable data-removal guarantees.
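The Newton-update removal step summarized above can be sketched for $L_2$-regularized logistic regression. This is a minimal illustration under stated assumptions, not the authors' implementation: labels are taken to be in {-1, +1}, the objective is assumed to be the summed logistic loss plus (lam * n / 2) * ||w||^2 plus a random linear term b·w for loss perturbation, and the helper names (`grad`, `hessian`, `train`, `newton_remove`), the synthetic data, and the values of `lam` and `sigma` are all hypothetical. Calibrating `sigma` to a residual bound, which the certified guarantee requires, is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y, lam, b):
    # Gradient of sum_i log(1 + exp(-y_i x_i.w)) + (lam * n / 2) ||w||^2 + b.w
    n = X.shape[0]
    margins = y * (X @ w)
    return X.T @ (-y * sigmoid(-margins)) + lam * n * w + b

def hessian(w, X, y, lam):
    # Hessian of the same objective (the linear term b.w contributes nothing).
    n, d = X.shape
    p = sigmoid(y * (X @ w))
    return (X.T * (p * (1.0 - p))) @ X + lam * n * np.eye(d)

def train(X, y, lam, b, iters=50):
    # Plain Newton iterations; adequate for this small, strongly convex problem.
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w = w - np.linalg.solve(hessian(w, X, y, lam), grad(w, X, y, lam, b))
    return w

def newton_remove(w_star, X, y, lam, idx):
    # One Newton step on the objective of the remaining data, starting at w_star.
    X_rem, y_rem = np.delete(X, idx, axis=0), np.delete(y, idx)
    x, t = X[idx], y[idx]
    # Contribution of the removed point (loss gradient plus its share of the regularizer).
    delta = lam * w_star + (-t * sigmoid(-t * (x @ w_star))) * x
    H = hessian(w_star, X_rem, y_rem, lam)
    return w_star + np.linalg.solve(H, delta)

# Synthetic demo (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n, d, lam, sigma = 200, 5, 1e-2, 1.0
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d) + 0.5 * rng.standard_normal(n))
b = sigma * rng.standard_normal(d)          # loss-perturbation vector

w_star = train(X, y, lam, b)                # model trained on all data
idx = n - 1                                 # index of the point to remove
w_minus = newton_remove(w_star, X, y, lam, idx)

X_rem, y_rem = np.delete(X, idx, axis=0), np.delete(y, idx)
w_retrain = train(X_rem, y_rem, lam, b)     # reference: retrain from scratch

# Gradient residual on the remaining data; it would be zero for exact retraining.
residual_before = np.linalg.norm(grad(w_star, X_rem, y_rem, lam, b))
residual_after = np.linalg.norm(grad(w_minus, X_rem, y_rem, lam, b))
print("distance to retrained model:", np.linalg.norm(w_minus - w_retrain))
print("gradient residual before/after:", residual_before, residual_after)
```

The single linear solve stands in for full retraining. The residual that remains after the update is the quantity that the paper's bounds control and that the perturbation `b` is meant to mask.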
Abstract
Good data stewardship requires removal of data at the request of the data's owner. This raises the question of whether and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to "remove" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.
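The indistinguishability guarantee described above can be written in a differential-privacy-style form. The notation here is assumed for illustration: $A$ is a randomized learning algorithm, $M$ a removal mechanism, $D$ a training set, $x \in D$ the point to remove, $\mathcal{H}$ the hypothesis space, and $\epsilon > 0$ the removal parameter. Under those assumptions, $M$ performs $\epsilon$-certified removal for $A$ if the following holds for every $\mathcal{T} \subseteq \mathcal{H}$, every $D$, and every $x \in D$:

```latex
e^{-\epsilon}
\;\le\;
\frac{P\bigl(M(A(D), D, x) \in \mathcal{T}\bigr)}
     {P\bigl(A(D \setminus x) \in \mathcal{T}\bigr)}
\;\le\;
e^{\epsilon}
```

The numerator is the distribution of models obtained by training on $D$ and then removing $x$; the denominator is the distribution obtained by training without $x$ in the first place. Smaller $\epsilon$ means the two are harder to tell apart.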
