Adversarial Mixup Unlearning
Zhuoyi Peng, Yixuan Tang, Yi Yang
TL;DR
This paper tackles catastrophic unlearning, where erasing targeted data from a trained model can inadvertently degrade performance on retained knowledge. It introduces MixUnlearn, a generator-unlearner framework in which an adversarial mixup generator produces hard samples from Forgetting and Remaining data, and two contrastive losses regularize forgetting and retention. The method achieves superior or competitive unlearning performance across class-level and data-level tasks on datasets such as CIFAR-10, SVHN, MNIST, and Fashion-MNIST. It remains robust in noisy and semi-supervised settings and gains efficiency from a lightweight MixBlock and periodic generator updates. Extensive analyses (ablations, representation visualizations, KDE loss distributions, and large-scale ImageNet/ViT experiments) support MixUnlearn as an effective, scalable approach to approximate machine unlearning that mitigates catastrophic unlearning while preserving essential knowledge and generalization.
Abstract
Machine unlearning is a critical area of research aimed at safeguarding data privacy by enabling the removal of sensitive information from machine learning models. One unique challenge in this field is catastrophic unlearning, where erasing specific data from a well-trained model unintentionally removes essential knowledge, causing the model to deviate significantly from a retrained one. To address this, we introduce a novel approach that regularizes the unlearning process by utilizing synthesized mixup samples, which simulate the data susceptible to catastrophic effects. At the core of our approach is a generator-unlearner framework, MixUnlearn, where a generator adversarially produces challenging mixup examples, and the unlearner effectively forgets target information based on these synthesized data. Specifically, we first introduce a novel contrastive objective to train the generator in an adversarial direction: it generates examples that prompt the unlearner both to reveal information that should be forgotten and to lose essential knowledge. The unlearner, guided by two other contrastive loss terms, then processes the synthesized and real data jointly to ensure accurate unlearning without losing critical knowledge, overcoming catastrophic effects. Extensive evaluations across benchmark datasets demonstrate that our method significantly outperforms state-of-the-art approaches, offering a robust solution to machine unlearning. This work not only deepens understanding of unlearning mechanisms but also lays the foundation for effective machine unlearning with mixup augmentation.
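As a rough illustration of the mixup idea the abstract builds on, the sketch below interpolates samples from a forgetting set with samples from a remaining set. It is a minimal sketch of standard mixup with a Beta-sampled coefficient, not the paper's method: MixUnlearn trains a generator adversarially to produce the mixed examples, whereas the mixing here is random. The function name `mixup_pairs`, the `alpha` parameter, and the toy array shapes are all assumptions made for illustration.

```python
import numpy as np

def mixup_pairs(forget_x, remain_x, alpha=1.0, rng=None):
    """Standard (non-learned) mixup between forgetting and remaining samples.

    Hypothetical helper for illustration only; the paper's generator is
    trained adversarially rather than sampling coefficients at random.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = min(len(forget_x), len(remain_x))
    # One mixing coefficient per pair, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=n)
    # Reshape so the coefficient broadcasts over all non-batch dimensions.
    lam_b = lam.reshape((n,) + (1,) * (forget_x.ndim - 1))
    mixed = lam_b * forget_x[:n] + (1.0 - lam_b) * remain_x[:n]
    return mixed, lam

# Toy usage: all-zero "forgetting" images and all-one "remaining" images,
# so every mixed pixel equals 1 - lam for its pair.
rng = np.random.default_rng(0)
forget_x = np.zeros((4, 3, 32, 32))
remain_x = np.ones((4, 3, 32, 32))
mixed, lam = mixup_pairs(forget_x, remain_x, alpha=1.0, rng=rng)
```

Each mixed sample is a convex combination of one forgetting sample and one remaining sample, so it blends information to be erased with information to be kept. This matches the abstract's description of synthesized data that is susceptible to catastrophic effects.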
