Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models

Alvin Heng; Harold Soh

TL;DR

Selective Amnesia (SA) reframes forgetting as a Bayesian continual learning problem to selectively erase specific concepts from pretrained conditional generative models without access to the original training data. It unifies Elastic Weight Consolidation and Generative Replay into a single objective and introduces a surrogate objective that uses a configurable $q(\boldsymbol{x}|\boldsymbol{c}_f)$ to bound forgetting via the difference between $\log p(\boldsymbol{x}|\theta,\boldsymbol{c})$ and its surrogate. The method applies to conditional VAEs and diffusion models, with demonstrations on MNIST, CIFAR10, STL10, and Stable Diffusion, showing forgetting of discrete classes, celebrities, and nudity while preserving recall for other concepts. The work advances safe usage by enabling targeted, controllable forgetting and discusses limitations, extensions, and broader implications.

Abstract

The recent proliferation of large-scale text-to-image models has led to growing concerns that such models may be misused to generate harmful, misleading, and inappropriate content. Motivated by this issue, we derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models. Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten. Selective Amnesia can be applied to conditional variational likelihood models, which encompass a variety of popular deep generative frameworks, including variational autoencoders and large-scale text-to-image diffusion models. Experiments across different models demonstrate that our approach induces forgetting on a variety of concepts, from entire classes in standard datasets to celebrity and nudity prompts in text-to-image models. Our code is publicly available at https://github.com/clear-nus/selective-amnesia.
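
The combination described in the TL;DR (a surrogate likelihood term for the concept to forget, a generative-replay term for the concepts to keep, and an Elastic Weight Consolidation penalty) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the authors' code: the model is a one-dimensional Gaussian per class, and `gaussian_log_lik`, `ewc_penalty`, `sa_style_objective`, the diagonal Fisher estimate `fisher_diag`, and the weight `lam` are hypothetical names. The repository linked above holds the actual implementation.

```python
import numpy as np


def gaussian_log_lik(x, c, theta):
    """Toy conditional log-likelihood log p(x | theta, c), with x ~ N(theta[c], 1)."""
    return -0.5 * (x - theta[c]) ** 2 - 0.5 * np.log(2.0 * np.pi)


def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Quadratic EWC penalty around the pretrained parameters theta_star."""
    return 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)


def sa_style_objective(theta, theta_star, fisher_diag, lam,
                       x_surrogate, c_forget, x_replay, c_replay):
    """Objective to maximize: surrogate term + replay term - EWC penalty.

    x_surrogate: samples from a user-chosen q(x | c_f) for the concept to forget.
    x_replay:    samples for the remembered concepts (assumed here to be drawn
                 from the frozen pretrained model, so no training data is needed).
    """
    surrogate_term = np.mean(gaussian_log_lik(x_surrogate, c_forget, theta))
    replay_term = np.mean(gaussian_log_lik(x_replay, c_replay, theta))
    return surrogate_term + replay_term - ewc_penalty(theta, theta_star, fisher_diag, lam)


# Hypothetical setup: class 0 is to be forgotten, class 1 is to be remembered.
theta_star = np.array([0.0, 5.0])      # pretrained class means
fisher_diag = np.ones(2)               # assumed diagonal Fisher information
lam = 1.0                              # assumed EWC weight
x_surrogate = np.array([-3.0, -3.0])   # surrogate samples for class 0
c_forget = np.array([0, 0])
x_replay = np.array([5.0, 5.0])        # replayed samples for class 1
c_replay = np.array([1, 1])

before = sa_style_objective(theta_star, theta_star, fisher_diag, lam,
                            x_surrogate, c_forget, x_replay, c_replay)

# Moving only the class-0 parameter toward the surrogate raises the objective,
# while the class-1 parameter stays at its pretrained value.
theta_new = np.array([-1.5, 5.0])
after = sa_style_objective(theta_new, theta_star, fisher_diag, lam,
                           x_surrogate, c_forget, x_replay, c_replay)
print(before, after)
```

In this toy setting, a larger `lam` keeps the parameters closer to `theta_star`, trading weaker forgetting for better preservation of the remembered class.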

Paper Structure (46 sections, 6 theorems, 19 equations, 19 figures, 4 tables)

Introduction
Background and Related Work
Variational Generative Models
Conditional Variational Autoencoders.
Conditional Diffusion Models.
Continual Learning
Elastic Weight Consolidation.
Generative Replay.
Data Forgetting
Editing and Unlearning in Generative Models
Concept Erasure in Text-to-Image Models
Proposed Method: Selective Amnesia
Problem Statement.
A Bayesian Continual Learning Approach to Forgetting.
Generative Replay Over $D_r$
...and 31 more sections

Key Result

Theorem 1

Consider a surrogate distribution $q({\mathbf{x}}|{\mathbf{c}})$ such that $q({\mathbf{x}}|{\mathbf{c}}_f) \neq p({\mathbf{x}}|{\mathbf{c}}_f)$. Assume we have access to the MLE optimum for the full dataset $\theta^* = \mathop{\mathrm{arg\,max}}\limits_\theta \mathbb{E}_{p({\mathbf{x}}, {\mathbf{c}}

Figures (19)

Figure 1: Qualitative results of our method, Selective Amnesia (SA). SA can be applied to a variety of models, from forgetting textual prompts such as specific celebrities or nudity in text-to-image models to discrete classes in VAEs and diffusion models (DDPM).
Figure 2: Illustration of training a VAE to forget the MNIST digit 0. The 'original' column shows the baseline samples generated by the VAE. In the 'naive' column, we train the VAE to optimize Eq. \ref{eq:min_ll_obj} with $D_f$ being the '0' class, while in the 'ours' column we train using the modified objective Eq. \ref{eq:surrogate_obj}.
Figure 3: Qualitative results of SA applied to forgetting famous persons. Within each column, the leftmost image represents SD v1.4 samples, the middle image represents SA with $q({\mathbf{x}}|{\mathbf{c}}_f)$ set to "middle aged man/woman" and the rightmost image is SA with $q({\mathbf{x}}|{\mathbf{c}}_f)$ set to "male/female clown". [...] is substituted with either "Brad Pitt" or "Angelina Jolie".
Figure 4: Comparison of SA with ESD and SLD in forgetting Brad Pitt. We use SA with $q({\mathbf{x}}|{\mathbf{c}}_f)$ set to "middle aged man". Images on the left are samples generated with the prompts specified per column. Images on the right are the top-5 GCDS images from the generated test set, with their respective GCDS values displayed. Intuitively, these are the five images to which the GCD network assigns the highest probability of depicting Brad Pitt.
Figure 5: Comparison of SA with ESD and SLD in forgetting Angelina Jolie. We use SA with $q({\mathbf{x}}|{\mathbf{c}}_f)$ set to "middle aged woman". Images on the left are samples generated with the prompts specified per column. Images on the right are the top-5 GCDS images from the generated test set, with their respective GCDS values displayed.
...and 14 more figures

Theorems & Definitions (10)

Theorem 1
Corollary 1
Lemma 1
proof
Lemma 2
proof
Theorem 1
proof
Corollary 1
proof
