Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images

Jacopo Bonato, Marco Cotogni, Luigi Sabetta

TL;DR

The paper tackles privacy-preserving unlearning without access to a retain set and introduces SCAR, a model-agnostic method that combines metric learning with a distillation-trick to erase forget-set information while preserving test performance. It uses the Mahalanobis distance $d_M$ to move the feature vectors of forget samples toward the nearest non-forget class distribution, and a surrogate out-of-distribution dataset with the Jensen-Shannon divergence $d_{JS}$ to transfer knowledge from the original model to the unlearning model. The authors also propose SCAR Self-forget, which performs class removal without forget data, and report competitive performance in the class-removal (CR) and homogeneous-removal (HR) settings on CIFAR and TinyImagenet, with architecture-agnostic results and ablation studies. Overall, SCAR offers a retain-set-free, architecture-agnostic approach to approximate unlearning with practical implications for privacy and data rights; the paper also outlines limitations and directions for certifiability research.
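The relocation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it uses the standard Mahalanobis distance (the abstract mentions a modified variant whose exact form is not given here), adds a small ridge term to keep each covariance invertible, and runs on synthetic 2-D features. The names `class_statistics`, `mahalanobis`, and `nearest_wrong_class` are hypothetical.

```python
import numpy as np


def class_statistics(features, labels, ridge=1e-3):
    """Return {class: (mean, inverse covariance)} computed from feature vectors."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]
        mean = x.mean(axis=0)
        # Ridge term keeps the covariance invertible (an assumption of this sketch).
        cov = np.cov(x, rowvar=False) + ridge * np.eye(x.shape[1])
        stats[int(c)] = (mean, np.linalg.inv(cov))
    return stats


def mahalanobis(x, mean, inv_cov):
    """Standard Mahalanobis distance d_M between x and a class distribution."""
    diff = x - mean
    return float(np.sqrt(diff @ inv_cov @ diff))


def nearest_wrong_class(x, true_label, stats):
    """Class with the smallest d_M to x, excluding the sample's own class."""
    distances = {
        c: mahalanobis(x, mean, inv_cov)
        for c, (mean, inv_cov) in stats.items()
        if c != true_label
    }
    return min(distances, key=distances.get)


# Synthetic data: three well-separated 2-D classes, for illustration only.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 10.0]])
labels = np.repeat(np.arange(3), 200)
features = centers[labels] + rng.normal(scale=0.5, size=(600, 2))

stats = class_statistics(features, labels)
forget_sample = np.array([1.0, 0.2])  # labelled class 0, but leaning toward class 1
target = nearest_wrong_class(forget_sample, true_label=0, stats=stats)
```

In an unlearning loop, `target` would identify the class distribution toward which the forget sample's feature vector is pulled; how that pull is turned into a loss is left out of this sketch.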

Abstract

In this paper, we introduce Selective-distillation for Class and Architecture-agnostic unleaRning (SCAR), a novel approximate unlearning method. SCAR efficiently eliminates specific information while preserving the model's test accuracy without using a retain set, which is a key component in state-of-the-art approximate unlearning algorithms. Our approach uses a modified Mahalanobis distance to guide the unlearning of the feature vectors of the instances to be forgotten, aligning them to the nearest wrong class distribution. Moreover, we propose a distillation-trick mechanism that distills the knowledge of the original model into the unlearning model using out-of-distribution images, retaining the original model's test performance without any retain set. Importantly, we propose a self-forget version of SCAR that unlearns without access to the forget set. We experimentally verified the effectiveness of our method on three public datasets, comparing it with state-of-the-art methods. Our method outperforms methods that operate without the retain set and performs comparably to the best methods that rely on it.
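The distillation-trick can be sketched as a Jensen-Shannon divergence between the predictions of the original model and the unlearning model on out-of-distribution images. The following is a minimal pure-Python illustration, not the authors' code: it assumes the divergence is taken between softmax probabilities derived from the logits and averaged over a batch, and the logit values are made up. The function name `distillation_loss` is hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    mix = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


def distillation_loss(original_logits, unlearned_logits):
    """Mean d_JS between the two models' predictions over a batch."""
    losses = [
        js_divergence(softmax(a), softmax(b))
        for a, b in zip(original_logits, unlearned_logits)
    ]
    return sum(losses) / len(losses)


# Hypothetical logits for two out-of-distribution images and three classes.
original = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
unlearned = [[1.5, 0.7, -0.8], [0.1, 0.2, 3.0]]
loss = distillation_loss(original, unlearned)
```

The loss is zero when both models produce identical predictions and grows as they diverge, which is why minimizing it on surrogate images can keep the unlearning model close to the original one without touching any retain data.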

Paper Structure (24 sections, 9 equations, 6 figures, 11 tables)

Figures (6)

  • Figure 1: SCAR overview. A) SCAR scheme. B) Distillation-trick through the Jensen-Shannon divergence between the logits $\Phi_\theta(x_j)$ and $\Phi^U_{\theta}(x_j)$. C) Feature-vector distributions of samples from 3 classes before (left) and after (right) the unlearning process. Ellipses denote the $95\%$ confidence intervals of the sample distributions.
  • Figure 2: A-B) t-SNE plots of feature vectors for the first 20 classes in CIFAR100 (A, $\mathcal{D}$) and for the surrogate dataset, a subset of Imagenet (B, $\mathcal{D}^{\text{sur}}$).
  • Figure 3: Scheme of SCAR self-forget in CR. During the unlearning procedure, the surrogate dataset $\mathcal{D}^\text{sur}$ supplies both surrogates, $\mathcal{D}_r^\text{sur}$ and $\mathcal{D}_f^\text{sur}$.
  • Figure 4: Values of final AUS for SCAR and SCAR self-forget as a function of the number of samples available in $\mathcal{D}^{\text{sur}}$ (subset of Imagenet1K). AUS is reported as mean $\pm$ std over ten runs.
  • Figure 5: Examples of train loss (left) and train-set and test-set accuracies (right) of a ResNet18 trained with the distillation-trick mechanism on the subset of Imagenet1K.
  • ...and 1 more figure