Learning to Unlearn for Robust Machine Unlearning
Mark He Huang, Lin Geng Foo, Jun Liu
TL;DR
Machine unlearning (MU) seeks to erase knowledge of the samples in $\mathcal{D}_f$ while preserving performance on the remaining set $\mathcal{D}_r$, but balancing forgetting against generalization is difficult. The authors propose Learning-to-Unlearn (LTU), a meta-learning framework that combines a three-phase scheme (meta-tune, meta-test, meta-update) with a Gradient Harmonization component that aligns the forgetting and remembering updates. Feedback signals come from a small subset $\hat{\mathcal{D}}_r$ of the remaining set and from $K$ differentiable Membership Inference models operating on an audit set derived from $\mathcal{D}_f$. LTU constructs support and query sets with distinct tasks for remembering and forgetting, and optimizes a combined objective in which this generalized feedback improves both goals. Experiments on CIFAR-100 and Tiny-ImageNet with ResNet-18 and ViT show LTU achieving state-of-the-art unlearning performance under partial data access, and ablations confirm that both meta-optimization and gradient harmonization are needed for robust MU.
Abstract
Machine unlearning (MU) seeks to remove knowledge of specific data samples from trained models without complete retraining, a task made challenging by its dual objectives: effectively erasing the data and maintaining the overall performance of the model. Despite recent advances, balancing these two objectives remains difficult. From a fresh perspective of generalization, we introduce a novel Learning-to-Unlearn (LTU) framework, which adopts a meta-learning approach to optimize the unlearning process and improve forgetting and remembering in a unified manner. LTU includes a meta-optimization scheme that enables models to preserve generalizable knowledge using only a small subset of the remaining set, while thoroughly forgetting the specified data samples. We also introduce a Gradient Harmonization strategy that aligns the optimization trajectories for remembering and forgetting by mitigating gradient conflicts, thus ensuring efficient and effective model updates. Our approach demonstrates improved efficiency and efficacy for MU, offering a promising solution to the challenges of data rights and model reusability.
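
To make the idea of "mitigating gradient conflicts" concrete, the sketch below shows one common way to reconcile two gradients that point in opposing directions: when their dot product is negative, each gradient has its component along the other removed before the two are summed. This projection rule is borrowed from PCGrad-style gradient surgery and is an assumption made for illustration; the abstract does not specify the exact rule used by Gradient Harmonization, and the names `project_out_conflict` and `harmonize` are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out_conflict(g: Vector, other: Vector) -> Vector:
    """Return g with its component along `other` removed, but only when
    the two gradients conflict (negative dot product).

    Illustrative assumption: a PCGrad-style projection, not necessarily
    the rule used by LTU's Gradient Harmonization.
    """
    d = dot(g, other)
    norm_sq = dot(other, other)
    if d >= 0.0 or norm_sq == 0.0:
        return list(g)  # no conflict (or zero gradient): leave g unchanged
    scale = d / norm_sq
    return [gi - scale * oi for gi, oi in zip(g, other)]


def harmonize(g_forget: Vector, g_remember: Vector) -> Vector:
    """Combine the forgetting and remembering gradients after removing
    their mutually conflicting components."""
    g_f = project_out_conflict(g_forget, g_remember)
    g_r = project_out_conflict(g_remember, g_forget)
    return [a + b for a, b in zip(g_f, g_r)]


# Toy example: the two gradients conflict (dot product is -1).
g_forget = [1.0, 0.0]
g_remember = [-1.0, 1.0]
combined = harmonize(g_forget, g_remember)
```

In this toy case the combined update no longer opposes either objective: its dot product with both `g_forget` and `g_remember` is non-negative, whereas a plain sum of the raw gradients would have zero progress along the forgetting direction. In a real model the same operation would be applied to flattened parameter gradients.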
