Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models
Alvin Heng, Harold Soh
TL;DR
Selective Amnesia (SA) frames forgetting as a Bayesian continual learning problem in order to erase specific concepts from pretrained conditional generative models without access to the original training data. It combines Elastic Weight Consolidation and Generative Replay in a single objective. It also introduces a surrogate objective built on a user-configurable distribution $q(\boldsymbol{x}|\boldsymbol{c}_f)$, which lets the user specify how the concept $\boldsymbol{c}_f$ is forgotten and bounds forgetting through the difference between $\log p(\boldsymbol{x}|\theta,\boldsymbol{c})$ and its surrogate. The method applies to conditional VAEs and diffusion models. Experiments on MNIST, CIFAR10, STL10, and Stable Diffusion show forgetting of discrete classes, celebrities, and nudity while the models retain their ability to generate other concepts. By enabling targeted, controllable forgetting, the work supports safer use of generative models; it also discusses limitations, extensions, and broader implications.
Abstract
The recent proliferation of large-scale text-to-image models has led to growing concerns that such models may be misused to generate harmful, misleading, and inappropriate content. Motivated by this issue, we derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models. Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten. Selective Amnesia can be applied to conditional variational likelihood models, which encompass a variety of popular deep generative frameworks, including variational autoencoders and large-scale text-to-image diffusion models. Experiments across different models demonstrate that our approach induces forgetting on a variety of concepts, from entire classes in standard datasets to celebrity and nudity prompts in text-to-image models. Our code is publicly available at https://github.com/clear-nus/selective-amnesia.
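To make the structure of the combined objective concrete, the following is a minimal sketch, not the authors' implementation (which is in the linked repository). It assumes the loss to be minimised is an unweighted sum of three terms: a negative log-likelihood (or negative ELBO) term on samples drawn from the surrogate $q(\boldsymbol{x}|\boldsymbol{c}_f)$ and paired with the concept to forget, a generative replay term on samples produced by the frozen original model for the concepts to remember, and the standard diagonal-Fisher Elastic Weight Consolidation penalty around the pretrained parameters. All function and argument names, and the weight `lam`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def ewc_penalty(theta, theta_star, fisher_diag, lam):
    """Diagonal EWC penalty: lam * sum_i F_i / 2 * (theta_i - theta_star_i)^2."""
    theta = np.asarray(theta, dtype=float)
    theta_star = np.asarray(theta_star, dtype=float)
    fisher_diag = np.asarray(fisher_diag, dtype=float)
    return lam * 0.5 * np.sum(fisher_diag * (theta - theta_star) ** 2)


def selective_amnesia_loss(nll_forget_surrogate, nll_remember_replay,
                           theta, theta_star, fisher_diag, lam=1.0):
    """Illustrative combined loss (to be minimised); an assumed form.

    nll_forget_surrogate: per-sample negative log-likelihood (or negative
        ELBO) of surrogate samples x ~ q(x | c_f), conditioned on c_f.
    nll_remember_replay: per-sample negative log-likelihood (or negative
        ELBO) of replay samples generated by the frozen original model
        for the concepts to remember.
    theta, theta_star: current and pretrained parameters (flattened).
    fisher_diag: diagonal Fisher information estimated at theta_star.
    lam: hypothetical weight on the EWC penalty.
    """
    forget_term = np.mean(nll_forget_surrogate)    # remap c_f towards q
    remember_term = np.mean(nll_remember_replay)   # generative replay
    penalty = ewc_penalty(theta, theta_star, fisher_diag, lam)
    return float(forget_term + remember_term + penalty)


if __name__ == "__main__":
    # Toy numbers only; in practice the NLL values come from the model.
    loss = selective_amnesia_loss(
        nll_forget_surrogate=[1.0, 3.0],
        nll_remember_replay=[4.0],
        theta=[1.0, 2.0],
        theta_star=[0.0, 0.0],
        fisher_diag=[2.0, 1.0],
        lam=2.0,
    )
    print(loss)
```

In this sketch the choice of $q(\boldsymbol{x}|\boldsymbol{c}_f)$ enters only through the samples used to compute `nll_forget_surrogate`, which is where a user would control how the concept is forgotten. Parameters with large Fisher values are pulled most strongly towards their pretrained values, and the replay term keeps the remaining concepts anchored without needing the original training data.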
