Learning to Unlearn for Robust Machine Unlearning

Mark He Huang; Lin Geng Foo; Jun Liu

TL;DR

Machine unlearning (MU) seeks to erase a trained model's knowledge of the samples to be forgotten, $\mathcal{D}_f$, while preserving performance on the remaining set $\mathcal{D}_r$, but balancing forgetting against generalization is difficult. The authors propose Learning-to-Unlearn (LTU), a meta-learning framework that combines a three-phase meta-optimization scheme (meta-tune, meta-test, meta-update) with a Gradient Harmonization component that aligns the forgetting and remembering updates. Feedback signals come from a small subset $\hat{\mathcal{D}}_r$ of the remaining set and from $K$ differentiable Membership Inference models operating on an audit set derived from $\mathcal{D}_f$. LTU constructs separate support and query sets for the remembering and forgetting tasks and optimizes a combined objective that uses this generalization feedback to improve both. Empirical results on CIFAR-100 and Tiny-ImageNet with ResNet-18 and ViT show LTU achieving state-of-the-art unlearning performance under partial data access, with ablations confirming the necessity of both the meta-optimization scheme and Gradient Harmonization.

Abstract

Machine unlearning (MU) seeks to remove knowledge of specific data samples from trained models without requiring complete retraining. The task is challenging because it has two objectives that pull against each other: erasing the targeted data effectively and maintaining the overall performance of the model. Despite recent advances, balancing these two objectives remains difficult. From a fresh perspective of generalization, we introduce a novel Learning-to-Unlearn (LTU) framework, which adopts a meta-learning approach to optimize the unlearning process and improve forgetting and remembering in a unified manner. LTU includes a meta-optimization scheme that enables models to preserve generalizable knowledge using only a small subset of the remaining set, while thoroughly forgetting the specific data samples. We also introduce a Gradient Harmonization strategy that aligns the optimization trajectories for remembering and forgetting by mitigating gradient conflicts, thus ensuring efficient and effective model updates. Our approach demonstrates improved efficiency and efficacy for MU, offering a promising solution to the challenges of data rights and model reusability.
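
To make the three-phase scheme (meta-tune, meta-test, meta-update) more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a toy linear model with a mean-squared-error loss as a stand-in for the actual remembering and forgetting losses, and a first-order approximation that ignores the second-derivative term arising from differentiating through the temporary model. The names `mse_grad`, `meta_step`, `inner_lr`, and `outer_lr` are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, float]]  # (features, target) pairs


def mse_grad(theta: Vector, data: Dataset) -> Vector:
    """Gradient of the mean squared error of the linear model theta . x."""
    grad = [0.0] * len(theta)
    for x, y in data:
        err = sum(t * xi for t, xi in zip(theta, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad


def meta_step(theta: Vector, support: Dataset, queries: List[Dataset],
              inner_lr: float = 0.05, outer_lr: float = 0.05) -> Vector:
    """One meta-tune / meta-test / meta-update step (first-order sketch)."""
    # Meta-tune: temporary update on the support set S gives theta_tau.
    g_support = mse_grad(theta, support)
    theta_tau = [t - inner_lr * g for t, g in zip(theta, g_support)]

    # Meta-test: evaluate theta_tau on each query set Q^i and average
    # the resulting gradients as generalization feedback.
    g_query = [0.0] * len(theta)
    for q in queries:
        g_i = mse_grad(theta_tau, q)
        g_query = [a + b / len(queries) for a, b in zip(g_query, g_i)]

    # Meta-update: combine the feedback from both phases and update theta.
    # Assumption: query gradients at theta_tau are applied directly to theta.
    return [t - outer_lr * (gs + gq)
            for t, gs, gq in zip(theta, g_support, g_query)]


if __name__ == "__main__":
    support = [([1.0], 2.0), ([2.0], 4.0)]
    queries = [[([3.0], 6.0)], [([0.5], 1.0)]]
    theta = [0.0]
    for _ in range(30):
        theta = meta_step(theta, support, queries)
    print(theta)
```

A full second-order version would differentiate the query loss through the meta-tune step with an automatic differentiation library. The first-order form is used here only to keep the sketch dependency-free.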

Paper Structure (15 sections, 12 equations, 4 figures, 2 tables)

Introduction
Related Work
  Machine Unlearning
  Meta Learning
Method
  Meta Optimization Scheme
  Support and Query Formation
    Support and query for remembering feedback
    Support and query for forgetting feedback
  Gradient Harmonization
Experiments
  Experimental Setups
  Experimental Results
  Ablation Studies
Conclusion

Figures (4)

Figure 1: (a) Illustration of our meta optimization scheme, which is performed in three phases: meta-tune, meta-test and meta-update. Before performing meta optimization, we first construct the support set $\mathcal{S}$ and query sets $\{ \mathcal{Q}^i \}_{i=1}^N$ (highlighted in orange), which are disjoint from each other. Then, meta-tune is performed on the support set $\mathcal{S}$ to obtain a temporarily updated model $\theta^\tau$, followed by meta-test evaluations of $\theta^\tau$ using the query sets $\{ \mathcal{Q}^i \}_{i=1}^N$, which provide feedback on how well the "remembering" or "forgetting" generalizes. The feedback from these two phases is combined in meta-update, which computes a gradient to update the model $\theta$ in a generalizable manner. (b) Overview of our entire LTU pipeline, which includes one part to improve "remembering" (top half) and one part to improve "forgetting" (bottom half). Specifically, to improve "remembering" with only a small subset $\hat{\mathcal{D}}_r$ of the remaining set, we use $\hat{\mathcal{D}}_r$ to construct $N$ query sets. In contrast, to improve "forgetting", we sample from $K$ membership inference models $\{ \theta_{\mathcal{A}}^k \}_{k=1}^K$ to construct our support and query sets. Lastly, a Gradient Harmonization strategy is proposed to reduce the conflicts between the gradients from both parts, which further improves unlearning.
Figure 2: Illustration of our Gradient Harmonization strategy. (a) Updating with $g_f$ directly might be harmful, as $g_f$ might point in a direction that largely conflicts with $g_r$ (highlighted in red), which can undo much of the effect of $g_r$. (b) Updating jointly with $g_r + g_f$ can still result in the joint gradient canceling the effect of one of the gradient updates. (c) Our Gradient Harmonization strategy resolves the gradient conflict by making $g_f^\prime$ orthogonal to $g_r$, which reduces the negative impact on the remembering objective.
Figure 3: Impact of our meta optimization scheme.
Figure 4: Impact of our Gradient Harmonization strategy.
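
The geometry described in the Figure 2 caption can be illustrated with a small plain-Python sketch. It assumes a PCGrad-style projection: when $g_f$ conflicts with $g_r$ (negative inner product), the component of $g_f$ along $g_r$ is removed so that the resulting $g_f^\prime$ is orthogonal to $g_r$. The exact rule used in the paper may differ, and the function names `project_out` and `harmonize` are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(g: Vector, ref: Vector) -> Vector:
    """Return g with its component along ref removed if the two conflict.

    Assumption: a conflict means a negative inner product. Non-conflicting
    gradients, and the degenerate case ref == 0, are returned unchanged.
    """
    d = dot(g, ref)
    ref_sq = dot(ref, ref)
    if d >= 0.0 or ref_sq == 0.0:
        return list(g)
    scale = d / ref_sq
    return [gi - scale * ri for gi, ri in zip(g, ref)]


def harmonize(g_r: Vector, g_f: Vector) -> Vector:
    """Combine the remembering gradient g_r with the harmonized g_f'."""
    g_f_prime = project_out(g_f, g_r)
    return [a + b for a, b in zip(g_r, g_f_prime)]


if __name__ == "__main__":
    g_r = [1.0, 0.0]   # remembering gradient
    g_f = [-1.0, 1.0]  # forgetting gradient, conflicts with g_r
    print("naive sum:", [a + b for a, b in zip(g_r, g_f)])
    print("harmonized:", harmonize(g_r, g_f))
```

In this example the naive sum cancels the first coordinate of $g_r$ entirely, while the harmonized update keeps it, which mirrors the contrast between panels (b) and (c).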
