Certified Data Removal from Machine Learning Models

Chuan Guo, Tom Goldstein, Awni Hannun, Laurens van der Maaten

TL;DR

The paper defines certified removal (CR) as a rigorous indistinguishability guarantee between a model trained with some data removed and one that never observed that data, inspired by differential privacy. It introduces a practical CR mechanism for $L_2$-regularized linear models based on a one-step Newton update to attenuate the removed data's influence, combined with loss perturbation to mask the resulting gradient residuals. The authors derive both worst-case and data-dependent bounds on the gradient residual and extend the approach to batch removals and online computation, demonstrating that removal can be orders of magnitude cheaper than retraining. Through experiments on MNIST, LSUN, and SST, as well as DP-based feature extractors, they show that substantial numbers of removals are possible with acceptable accuracy loss, and that CR can be effectively integrated with public data or differentially private components to yield practical, scalable data-removal guarantees.
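The mechanism summarized above (a one-step Newton update plus a random linear perturbation of the loss) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes binary logistic loss with labels in {-1, +1}, an objective of the form sum of losses plus (λn/2)‖w‖² plus bᵀw with Gaussian b, a plain Newton solver for training, and synthetic data. All function names are hypothetical, and the bookkeeping that turns the gradient residual into an ε budget is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y, lam, b):
    # Gradient of sum_i log(1 + exp(-y_i w.x_i)) + (lam*n/2)||w||^2 + b.w
    n = X.shape[0]
    s = sigmoid(y * (X @ w))
    return X.T @ ((s - 1.0) * y) + lam * n * w + b

def hessian(w, X, lam):
    # Hessian of the same objective (the linear term b.w contributes nothing).
    n, d = X.shape
    s = sigmoid(X @ w)
    return (X.T * (s * (1.0 - s))) @ X + lam * n * np.eye(d)

def train(X, y, lam, b, iters=50):
    # Assumption: full Newton steps; adequate for this small, regularized problem.
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w = w - np.linalg.solve(hessian(w, X, lam), grad(w, X, y, lam, b))
    return w

def newton_remove(w_star, X, y, lam, idx):
    # One Newton step on the objective of the reduced dataset, started at w_star.
    keep = np.arange(X.shape[0]) != idx
    X_rem, y_rem = X[keep], y[keep]
    s = sigmoid(y[idx] * (X[idx] @ w_star))
    # delta is minus the gradient of the reduced objective at w_star.
    delta = lam * w_star + (s - 1.0) * y[idx] * X[idx]
    H = hessian(w_star, X_rem, lam)
    return w_star + np.linalg.solve(H, delta), X_rem, y_rem

# Synthetic data with ||x_i||_2 <= 1.
rng = np.random.default_rng(0)
n, d, lam, sigma = 500, 10, 1e-2, 1.0
X = rng.normal(size=(n, d))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
y = np.where(X @ rng.normal(size=d) > 0, 1.0, -1.0)
b = sigma * rng.normal(size=d)  # loss perturbation drawn once at training time

w_star = train(X, y, lam, b)
w_minus, X_rem, y_rem = newton_remove(w_star, X, y, lam, idx=n - 1)
# Retraining with the same b is only a reference point for this comparison.
w_retrain = train(X_rem, y_rem, lam, b)

# Gradient residual on the reduced dataset, before and after the update.
res_before = np.linalg.norm(grad(w_star, X_rem, y_rem, lam, b))
res_after = np.linalg.norm(grad(w_minus, X_rem, y_rem, lam, b))
print(res_before, res_after, np.linalg.norm(w_minus - w_retrain))
```

The removal step costs one Hessian solve instead of a full optimization run. The gradient residual left after the update is the quantity the loss perturbation is meant to mask.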

Abstract

Good data stewardship requires removal of data at the request of the data's owner. This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to "remove" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.
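The guarantee described above can be written in the style of differential privacy. As a sketch, with notation assumed here rather than taken from this page: $A$ is the (randomized) learning algorithm, $M$ the removal mechanism, $\mathcal{D}$ the training set, $\mathbf{x} \in \mathcal{D}$ the sample to be removed, and $\mathcal{T}$ any set of models. For $\epsilon > 0$, $M$ performs $\epsilon$-certified removal for $A$ if, for all such $\mathcal{T}$, $\mathcal{D}$, and $\mathbf{x}$:

$$
e^{-\epsilon} \;\leq\; \frac{P\big(M(A(\mathcal{D}), \mathcal{D}, \mathbf{x}) \in \mathcal{T}\big)}{P\big(A(\mathcal{D} \setminus \mathbf{x}) \in \mathcal{T}\big)} \;\leq\; e^{\epsilon}
$$

A smaller $\epsilon$ forces the distribution of models after removal closer to the distribution of models trained without $\mathbf{x}$.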

Paper Structure

This paper contains 29 sections, 12 theorems, 38 equations, 5 figures, 1 table, 2 algorithms.

Key Result

Theorem 1

Suppose that $\forall (\mathbf{x}_i, y_i) \in \mathcal{D}, \mathbf{w} \in \mathbb{R}^d: \|\nabla \ell(\mathbf{w}^\top \mathbf{x}_i, y_i)\|_2 \leq C$. Suppose also that $\ell''$ is $\gamma$-Lipschitz and $\| \mathbf{x}_i \|_2 \leq 1$ for all $(\mathbf{x}_i, y_i) \in \mathcal{D}$. Then: where $H_{\mathbf{w}_\eta}$ denotes the Hessian of $L(\cdot; \mathcal{D}')$ at the parameter vector $\mathbf{w}_\e

Figures (5)

  • Figure 1: Linear logistic regression on MNIST. Left: Effect of $L_2$-regularization parameter, $\lambda$, and standard deviation of the objective perturbation, $\sigma$, on test accuracy. Middle: Effect of $\epsilon$ on test accuracy when supporting 100 removals. Right: Trade-off between accuracy and supported number of removals at $\epsilon=1$. At a given $\epsilon$, higher $\lambda$ and $\sigma$ values reduce test accuracy but allow for many more removals.
  • Figure 2: Linear logistic regression on MNIST. Gradient residual norm (on log scale) as a function of the number of removals.
  • Figure 3: MNIST training digits sorted by norm of the removal update $\| H_{\mathbf{w}^*}^{-1} \Delta \|_2$. The samples with the highest norm (top) appear to be atypical, making it harder to undo their effect on the model. The samples with the lowest norm (bottom) are prototypical 3s and 8s, and hence are much easier to remove.
  • Figure 4: Linear models trained on public feature extractors. Trade-off between test accuracy and the expected number of supported removals (at $\epsilon\!=\!1$) on LSUN (left) and SST (right). The setting of $(\lambda, \sigma)$ is shown next to each point. The number of supported removals rapidly increases when accuracy is slightly sacrificed.
  • Figure 5: Using $\epsilon$-DP features. Trade-off between $\epsilon$ and test accuracy on SVHN of models that support 10 removals. Dashed line shows non-private model accuracy.
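
The ranking behind Figure 3 (training samples sorted by the norm of their removal update, $\| H_{\mathbf{w}^*}^{-1} \Delta \|_2$) can be illustrated on a toy problem. This is a hedged sketch and not the paper's MNIST experiment: it assumes synthetic data, binary logistic loss with labels in {-1, +1}, an unperturbed objective of the form sum of losses plus (λn/2)‖w‖², and a Hessian taken over the dataset with the candidate sample removed. Variable names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary classification data with ||x_i||_2 <= 1.
rng = np.random.default_rng(0)
n, d, lam = 300, 5, 1e-2
X = rng.normal(size=(n, d))
X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))
y = np.where(X @ rng.normal(size=d) + 0.1 * rng.normal(size=n) > 0, 1.0, -1.0)

# Fit w* with Newton's method on sum_i log(1 + exp(-y_i w.x_i)) + (lam*n/2)||w||^2.
w = np.zeros(d)
for _ in range(50):
    s = sigmoid(y * (X @ w))
    g = X.T @ ((s - 1.0) * y) + lam * n * w
    H = (X.T * (s * (1.0 - s))) @ X + lam * n * np.eye(d)
    w = w - np.linalg.solve(H, g)

# Per-sample removal-update norm ||H_i^{-1} delta_i||_2.
s = sigmoid(y * (X @ w))
H_full = (X.T * (s * (1.0 - s))) @ X + lam * n * np.eye(d)
norms = np.empty(n)
for i in range(n):
    # delta_i: regularizer share plus the loss gradient of sample i at w*.
    delta_i = lam * w + (s[i] - 1.0) * y[i] * X[i]
    # Hessian of the objective on the dataset without sample i.
    H_i = H_full - s[i] * (1.0 - s[i]) * np.outer(X[i], X[i]) - lam * np.eye(d)
    norms[i] = np.linalg.norm(np.linalg.solve(H_i, delta_i))

order = np.argsort(norms)[::-1]  # largest removal-update norm first
print("largest-norm samples:", order[:5], norms[order[:5]])
print("smallest-norm samples:", order[-5:], norms[order[-5:]])
```

Samples at the front of `order` require the largest parameter change to remove, which is the sense in which the caption calls them harder to remove.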

Theorems & Definitions (17)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Corollary 1
  • Theorem 4
  • Corollary 2
  • Theorem 5
  • Theorem 1
  • proof
  • Theorem 2
  • ...and 7 more