Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images

Jacopo Bonato; Marco Cotogni; Luigi Sabetta

TL;DR

The paper tackles privacy-preserving unlearning without access to a retain set and introduces SCAR, a model-agnostic method that combines metric learning with a distillation-trick to erase forget-set information while preserving test performance. It uses the Mahalanobis distance $d_M$ to move the feature vectors of forget samples toward the nearest non-forget class distribution, and a surrogate out-of-distribution dataset with the Jensen-Shannon divergence $d_{JS}$ to transfer knowledge from the original model to the unlearning model. The authors also propose SCAR Self-forget, which performs class removal without forget data, and report competitive performance in the class-removal (CR) and homogeneous-removal (HR) settings on CIFAR and TinyImageNet, along with architecture-agnostic results and ablations. Overall, SCAR offers a retain-set-free, architecture-agnostic approach to approximate unlearning with practical implications for privacy and data rights, and the paper outlines limitations and directions for certifiability research.
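To make the distillation term more concrete, the following is a minimal sketch of a Jensen-Shannon divergence between the outputs of two models on a batch of surrogate images. It is not the authors' implementation: the function names (`softmax`, `js_divergence`, `distillation_loss`) are hypothetical, and it assumes that logits are first turned into probability vectors with a softmax, which the summary above does not specify.

```python
import math


def softmax(logits):
    """Turn a list of logits into a probability vector (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by ln 2."""
    mix = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mix) + 0.5 * kl_divergence(q, mix)


def distillation_loss(original_logits, unlearned_logits):
    """Mean JS divergence over a batch (assumed form of the loss).

    Each argument is a list of per-image logit lists, one from the original
    model and one from the unlearning model, on the same surrogate images.
    """
    pairs = list(zip(original_logits, unlearned_logits))
    return sum(js_divergence(softmax(a), softmax(b)) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    original = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
    unlearned = [[1.5, 0.7, -0.8], [0.0, 0.4, 2.5]]
    print(distillation_loss(original, unlearned))
```

Because the divergence is symmetric and bounded, the loss stays finite even when the two models disagree strongly on a surrogate image; minimizing it pulls the unlearning model's predictions toward those of the original model.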

Abstract

In this paper, we introduce Selective-distillation for Class and Architecture-agnostic unleaRning (SCAR), a novel approximate unlearning method. SCAR efficiently eliminates specific information while preserving the model's test accuracy without using a retain set, which is a key component in state-of-the-art approximate unlearning algorithms. Our approach uses a modified Mahalanobis distance to guide the unlearning of the feature vectors of the instances to be forgotten, aligning them to the nearest wrong class distribution. Moreover, we propose a distillation-trick mechanism that distills the knowledge of the original model into the unlearning model using out-of-distribution images, retaining the original model's test performance without any retain set. Importantly, we propose a self-forget version of SCAR that unlearns without access to the forget set. We experimentally verify the effectiveness of our method on three public datasets, comparing it with state-of-the-art methods. Our method outperforms methods that operate without the retain set and is comparable to the best methods that rely on the retain set.
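As an illustration of the metric-learning step, the sketch below picks the nearest wrong class for a feature vector using the standard Mahalanobis distance $d_M(x) = \sqrt{(x-\mu)^\top \Sigma^{-1} (x-\mu)}$. It is a sketch under stated assumptions, not the paper's method: the abstract refers to a modified Mahalanobis distance whose details are not given here, so the plain form is used instead, and the names `solve`, `mahalanobis`, `nearest_wrong_class`, and `class_stats` are hypothetical.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mahalanobis(x, mean, cov):
    """Plain Mahalanobis distance of x from a class with given mean and covariance."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, diff)  # y = cov^{-1} @ diff, without forming the inverse
    return math.sqrt(max(0.0, sum(d * yi for d, yi in zip(diff, y))))


def nearest_wrong_class(x, true_label, class_stats):
    """Return the label of the closest class distribution other than true_label.

    class_stats maps each label to a (mean, covariance) pair estimated from
    feature vectors (hypothetical structure, for illustration only).
    """
    distances = {
        label: mahalanobis(x, mean, cov)
        for label, (mean, cov) in class_stats.items()
        if label != true_label
    }
    return min(distances, key=distances.get)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    class_stats = {
        0: ([0.0, 0.0], identity),
        1: ([3.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]),  # wide spread along axis 0
        2: ([0.0, 2.0], identity),
    }
    print(nearest_wrong_class([0.5, 0.0], true_label=0, class_stats=class_stats))
```

In this toy setup class 2 is closer in Euclidean terms, but class 1 is closer once its larger variance along the first axis is taken into account, which is the reason for measuring distance to a class distribution rather than to a centroid alone.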

Paper Structure (24 sections, 9 equations, 6 figures, 11 tables)

Introduction
Related Works
Methods
Preliminaries
SCAR
SCAR Self-forget
Experimental Results
Impact of different Datasets as $\mathcal{D}^\text{sur}$
Comparison with SOTA methods
Ablation Study
Analysis on the effect of different measures in the metric learning mechanism of SCAR
Analysis of the impact of the dimension of $\mathcal{D}^\text{sur}$
Architectural-Agnostic Results
Conclusions
Additional Analyses
...and 9 more sections

Figures (6)

Figure 1: SCAR overview. A) SCAR scheme. B) Distillation-trick through the Jensen-Shannon divergence of the logits of $\Phi_\theta(x_j)$ and $\Phi^U_{\theta}(x_j)$. C) Representation of the feature vector distributions of samples for 3 classes before (left) and after (right) the unlearning process. Ellipses denote the $95\%$ confidence intervals of the distribution of samples.
Figure 2: A-B) t-SNE plots of feature vectors for the first 20 classes in CIFAR100 (A, $\mathcal{D}$) and the surrogate dataset, a subset of ImageNet (B, $\mathcal{D}^{\text{sur}}$).
Figure 3: Scheme of SCAR self-forget in CR. The surrogate dataset $\mathcal{D}^\text{sur}$ supplies during the unlearning procedure both the surrogates $\mathcal{D}_r^\text{sur}$ and $\mathcal{D}_f^\text{sur}$.
Figure 4: Values of final AUS for SCAR and SCAR self-forget as a function of the number of samples available in $\mathcal{D}^{\text{sur}}$ (subset of ImageNet1K). AUS is reported as mean $\pm$ std over ten runs.
Figure 5: Examples of train loss (left) and train-set and test-set accuracies (right) of a ResNet18 trained with the distillation-trick mechanism on the subset of ImageNet1K.
...and 1 more figure
