Sharpness-Aware Parameter Selection for Machine Unlearning
Saber Malekmohammadi, Hong kyu Lee, Li Xiong
TL;DR
The paper addresses the problem of unlearning sensitive information from trained models by proposing a sharpness-aware parameter selection strategy that updates only a small subset of parameters. This subset is identified via the smallest diagonal Hessian entries $H_{learn}(\boldsymbol{\theta}^*)$, aligning with sharpness-aware minimization and yielding robust unlearning against relearning. The authors provide theoretical connections to robust unlearning and approximate second-order updates, and demonstrate empirically that updating these salient parameters achieves higher unlearning efficacy with lower computational cost on MNIST and CIFAR-10. The work offers a scalable approach for feature/label unlearning and suggests promising directions for applying the method to larger models and LLMs.
Abstract
It often happens that sensitive personal information, such as credit card numbers or passwords, is mistakenly incorporated in the training of machine learning models and needs to be removed afterwards. Removing such information from a trained model is a complex task that requires partially reversing the training process. Various machine unlearning techniques have been proposed in the literature to address this problem. Most of the proposed methods revolve around removing individual data samples from a trained model. Another, less explored direction is when the features/labels of a group of data samples need to be reverted. While the existing methods for these tasks perform unlearning by updating the whole set of model parameters or only the last layer of the model, we show that there is a subset of model parameters that contributes most to unlearning the target features. More precisely, the model parameters with the largest corresponding diagonal value in the Hessian matrix (computed at the learned model parameter) contribute most to the unlearning task. By selecting these parameters and updating them during the unlearning stage, we can make the most progress in unlearning. We provide theoretical justifications for the proposed strategy by connecting it to sharpness-aware minimization and robust unlearning. We empirically show the effectiveness of the proposed strategy in improving the efficacy of unlearning at a low computational cost.
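The selection step described in the abstract can be illustrated with a minimal sketch. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes a linear model with a mean squared-error loss (so the Hessian diagonal has a closed form), and the function names, the `largest` flag, the toy data, and the learning rate are all hypothetical. The default ranks parameters by the largest diagonal Hessian value, following the abstract; the flag only reverses the ordering for experimentation.

```python
def diag_hessian_least_squares(X):
    """Diagonal of the Hessian of the loss 1/(2n) * sum_i (x_i . w - y_i)^2.

    For this loss the Hessian is X^T X / n, so entry j is the mean of x_ij^2.
    (Assumption: a linear model is used only to keep the sketch closed-form.)
    """
    n, d = len(X), len(X[0])
    return [sum(row[j] ** 2 for row in X) / n for j in range(d)]


def select_parameters(diag, k, largest=True):
    """Return the sorted indices of the k parameters ranked by diagonal Hessian value."""
    order = sorted(range(len(diag)), key=lambda j: diag[j], reverse=largest)
    return sorted(order[:k])


def masked_gradient_step(w, X, y, selected, lr):
    """One gradient step on the squared-error loss that updates only `selected` indices."""
    n = len(X)
    residuals = [sum(wj * xj for wj, xj in zip(w, row)) - yi
                 for row, yi in zip(X, y)]
    grad = [sum(r * row[j] for r, row in zip(residuals, X)) / n
            for j in range(len(w))]
    chosen = set(selected)
    return [wj - lr * g if j in chosen else wj
            for j, (wj, g) in enumerate(zip(w, grad))]


# Hypothetical toy data: three samples, three parameters.
X = [[1.0, 0.1, 2.0],
     [2.0, 0.2, 0.0],
     [3.0, 0.1, 1.0]]
y_corrected = [1.0, 0.0, 2.0]   # labels after the reverted features/labels (made up)
w_learned = [0.5, 0.5, 0.5]     # stands in for the learned parameters

diag = diag_hessian_least_squares(X)
selected = select_parameters(diag, k=2)            # two largest diagonal entries
w_unlearned = masked_gradient_step(w_learned, X, y_corrected, selected, lr=0.1)
```

Only the selected coordinates move during the step; every other parameter keeps its learned value, which is where the reduced computational cost comes from. For a neural network the diagonal would have to be estimated rather than computed in closed form, which this sketch does not attempt.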
