Unlearning via Sparse Representations

Vedant Shah; Frederik Träuble; Ashish Malik; Hugo Larochelle; Michael Mozer; Sanjeev Arora; Yoshua Bengio; Anirudh Goyal

TL;DR

The paper tackles machine unlearning under practical compute constraints by using a Discrete Key-Value Bottleneck (DKVB) that yields sparse, localized representations. It proposes two zero-shot unlearning methods, Unlearning via Activations and Unlearning via Examples, that remove information about a forget class by masking selected key-value pairs, without retraining. Across CIFAR-10, CIFAR-100, LACUNA-100, and ImageNet-1k, with CLIP ViT-B/32 and ResNet-50 backbones, the approach achieves complete forget-class unlearning while preserving retain-class performance, and it requires substantially fewer FLOPs than SCRUB. The results indicate that built-in sparsity enables robust, compute-efficient unlearning that is applicable to large-scale models. The paper also outlines limitations and directions for future work on end-to-end sparse training and selective unlearning scenarios.

Abstract

Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove costly and infeasible with existing techniques. We propose a nearly compute-free zero-shot unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the data set. We evaluate the proposed technique on the problem of class unlearning using three datasets: CIFAR-10, CIFAR-100, and LACUNA-100. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all three datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.

Paper Structure (24 sections, 7 figures, 10 tables)

Introduction
Related Work
Background and Notations
Unlearning via Sparse Representations
Experiments and Results
Experimental Setup
Unlearning via the Discrete Key-Value Bottleneck
Comparison with Baselines
Proposed Methods achieve Unlearning in a Compute Efficient Manner
Limitations and Future Work
Conclusion
Acknowledgements
Appendix
Initial Performances of the Models
Deciding the Forget Class
...and 9 more sections

Figures (7)

Figure 1: A summary of the proposed unlearning approach. Left: The structure of a key-value bottleneck. The encoder is pre-trained and frozen, and $R_1$ is a random projection matrix. The values corresponding to the selected keys are retrieved to be used by the decoder. During training, the gradient is backpropagated through the decoder into the values. The figure depicts the case with 1 codebook in the DKVB; in practice we use multiple codebooks. Center: Examples from the forget set are passed through the trained model, and the key-value pairs selected during the forward pass are recorded. Right: The recorded key-value pairs are then masked from the bottleneck. As a result, key selection is redirected to other keys whose corresponding values are non-informative, leading to a different prediction.
Figure 2: Unlearning via Activations. Performance on the retain set test data vs. performance on the forget set test data across datasets for (a) the CLIP-pretrained ViT-B/32 backbone in the top row and (b) the ImageNet-pretrained ResNet-50 backbone in the bottom row, as the value of $N_a$ is increased (indicated by the color of the markers). Relative to the original models, performance on the retain set test data after unlearning increases for CIFAR-10 and ImageNet-1k and drops for CIFAR-100 and LACUNA-100 in the case of ViT-B/32, and increases for all four datasets in the case of ResNet-50 (see Table \ref{tab:baseline_compare}).
Figure 3: Unlearning via Examples. Performance on the retain set test data vs. performance on the forget set test data across datasets for (a) the CLIP-pretrained ViT-B/32 backbone in the top row and (b) the ImageNet-pretrained ResNet-50 backbone in the bottom row, as the value of $N_e$ is increased (indicated by the color of the markers). Relative to the base model, performance on the retain set test data increases for CIFAR-10 and drops for all other datasets in the case of ViT-B/32, whereas it drops for CIFAR-10 and CIFAR-100 and increases for LACUNA-100 and ImageNet-1k in the case of ResNet-50 (see Table \ref{tab:baseline_compare}).
Figure 4: Number of misclassifications per class for the test data. The red bars mark the class with the fewest misclassifications. (a) CIFAR-10: class 1 has the fewest misclassifications. (b) CIFAR-100: class 58 has the fewest misclassifications. (c) LACUNA-100: classes 34, 48, 65, 76, 82 and 85 have 0 misclassifications and hence do not have a bar.
Figure 5: Multi Class Unlearning for CIFAR-10
...and 2 more figures
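
The mechanism summarized in the Figure 1 caption (nearest-key selection, recording the keys hit by forget-set examples, then masking them so that selection is redirected) can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes a single codebook, uses hand-picked 2-D vectors in place of the frozen encoder and the random projection $R_1$, and replaces the decoder with a direct read-out of the retrieved value. The names `ToyBottleneck`, `select`, `forward`, and `unlearn` are hypothetical and chosen only for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyBottleneck:
    """Toy single-codebook key-value bottleneck (illustrative assumption)."""

    def __init__(self, keys, values):
        self.keys = keys      # fixed key vectors
        self.values = values  # value vectors (would be learned in training)
        self.masked = set()   # indices of key-value pairs removed by unlearning

    def select(self, z):
        """Index of the nearest key to z among the unmasked keys."""
        candidates = [i for i in range(len(self.keys)) if i not in self.masked]
        return min(candidates, key=lambda i: sq_dist(self.keys[i], z))

    def forward(self, z):
        """Retrieve the value paired with the selected key."""
        return self.values[self.select(z)]

    def unlearn(self, forget_encodings):
        """Record the keys selected by forget examples, then mask them."""
        selected = {self.select(z) for z in forget_encodings}
        self.masked |= selected
        return selected


# Keys 0-1 serve a retained class, keys 2-3 serve the forget class,
# and key 4 was never used in training, so its value is non-informative.
keys = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (4.0, 4.0)]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
bottleneck = ToyBottleneck(keys, values)

forget_inputs = [(5.1, 5.0), (5.0, 6.2)]  # stand-ins for projected encodings
retain_input = (0.1, 0.1)

before_forget = [bottleneck.forward(z) for z in forget_inputs]
masked_keys = bottleneck.unlearn(forget_inputs)
after_forget = [bottleneck.forward(z) for z in forget_inputs]
after_retain = bottleneck.forward(retain_input)
```

In this toy setting, masking touches only the keys that forget-set inputs actually select, so the retained input still retrieves its original value while the forget inputs fall through to the unused key. No gradient step or retraining is involved, which is the property the abstract describes as nearly compute-free.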
