Knowledge Swapping via Learning and Unlearning

Mingyu Xing, Lechao Cheng, Shengeng Tang, Yaxiong Wang, Zhun Zhong, Meng Wang

TL;DR

Knowledge Swapping proposes a unified task to forget specified knowledge while retaining essentials and acquiring new information. The authors reveal a knock-on feature hierarchy where learning and forgetting progress in opposite directions, motivating a Learning Before Forgetting approach implemented via LoRA-based fine-tuning and group sparse regularization. Across classification, segmentation, and detection, this two-stage strategy achieves strong learning on new content, robust forgetting of targeted knowledge, and stable retention of prior capabilities, outperforming alternative orderings. The work provides a practical framework for controlled knowledge management in pretrained models with broad implications for privacy, security, and continual learning applications.
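
As a rough illustration of the group sparse regularization mentioned above, the sketch below computes a group-lasso style penalty: the sum of the L2 norms of each parameter group. The grouping (one group per Transformer block), the names `lora_groups` and `group_sparse_penalty`, and the weight `0.1` are assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def group_sparse_penalty(groups, weight=1.0):
    """Group-lasso style penalty: weight * sum of per-group L2 norms.

    `groups` maps a (hypothetical) group name to a flat list of parameters.
    Penalizing the norm of each group as a whole pushes entire groups
    toward zero rather than individual parameters.
    """
    total = 0.0
    for params in groups.values():
        total += math.sqrt(sum(p * p for p in params))
    return weight * total


# Hypothetical LoRA parameter groups, one per Transformer block (assumption).
lora_groups = {
    "block0": [3.0, 4.0],        # L2 norm 5.0
    "block1": [0.0, 0.0],        # a fully zeroed group adds nothing
    "block2": [1.0, 2.0, 2.0],   # L2 norm 3.0
}

penalty = group_sparse_penalty(lora_groups, weight=0.1)
print(penalty)
```

In a training loop, a term like this would typically be added to the task loss, so that the optimizer trades task performance against switching off whole groups of adapter parameters.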

Abstract

We introduce Knowledge Swapping, a novel task designed to selectively regulate the knowledge of a pretrained model by forgetting user-specified information, retaining essential knowledge, and acquiring new knowledge simultaneously. By analyzing the knock-on feature hierarchy, we find that incremental learning typically progresses from low-level representations to higher-level semantics, whereas forgetting tends to occur in the opposite direction, starting from high-level semantics and moving down to low-level features. Building upon this, we propose to benchmark the knowledge swapping task with the strategy of Learning Before Forgetting. Comprehensive experiments on various tasks, including image classification, object detection, and semantic segmentation, validate the effectiveness of the proposed strategy. The source code is available at https://github.com/xingmingyu123456/KnowledgeSwapping.

Paper Structure

This paper contains 19 sections, 9 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Comparison of three tasks: Continuous Learning, Machine Unlearning, and our Knowledge Swapping.
  • Figure 2: $\mathcal{L}_2$ norm for each parameter under $L\rightarrow F$ and $F\rightarrow L$. The superscript $W$ denotes the weight norm value at the current stage. The figure illustrates that (a) during the Learning Before Forgetting phase, changes in parameter norms are predominantly concentrated in layers responsible for high-level semantic representations. Conversely, (b) in the Learning After Forgetting phase, parameter norm changes primarily occur in layers associated with low-level feature representations.
  • Figure 3: Benchmark Framework. First, we decouple knowledge swapping into separate learning and forgetting processes. We observe that the learning process progresses from low-level features to high-level features, while the forgetting process proceeds in the opposite direction, from high-level features to low-level features. Therefore, a two-stage strategy of Learning Before Forgetting is adopted. In general, we adopt LoRA to fine-tune the linear layers in each Transformer block, with all other parameters frozen, to enable selective regulation of the model's knowledge.
  • Figure 4: Logarithm of the Average Gradient. We compute the logarithm of cumulative average gradient changes at different stages in the $L \rightarrow F$ and $F \rightarrow L$ processes. We observe two key phenomena: first, parameter changes during the learning phases ($L^G \rightarrow F$ and $F \rightarrow L^G$) are consistently more significant, indicating that the Learning process is relatively challenging; second, in the $L \rightarrow F^G$ phase, the final updates to forgetting gradients remain consistently small, suggesting that Learning Before Forgetting is more stable.
  • Figure 5: Qualitative results on semantic segmentation. The forgotten classes are marked with red dotted lines, and the learned class is marked with dark green dotted lines.
  • ...and 2 more figures
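
The captions for Figures 2 and 3 describe LoRA fine-tuning of linear layers with frozen base weights, and L2 norms tracked per parameter. The sketch below illustrates both ideas on a single toy linear layer, assuming the standard LoRA form W_eff = W + scale * (B @ A). All names, shapes, and values are hypothetical, and the quantity plotted in Figure 2 may be defined differently in the paper.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_delta(lora_b, lora_a, scale=1.0):
    """Low-rank update scale * (B @ A); only A and B would be trained."""
    return [[scale * v for v in row] for row in matmul(lora_b, lora_a)]


def lora_effective_weight(w_frozen, lora_b, lora_a, scale=1.0):
    """Effective weight W + scale * (B @ A); W itself is never modified."""
    delta = lora_delta(lora_b, lora_a, scale)
    return [
        [w + d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w_frozen, delta)
    ]


def l2_norm(matrix):
    """L2 (Frobenius) norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


# Toy frozen weight of one linear layer (out=2, in=2), rank r=1 (assumptions).
w_frozen = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[0.5, -0.5]]          # shape r x in
lora_b_init = [[0.0], [0.0]]    # shape out x r; zero init leaves W unchanged

w_at_init = lora_effective_weight(w_frozen, lora_b_init, lora_a)

# Hypothetical values of B after some training steps.
lora_b_trained = [[1.0], [2.0]]
update_norm = l2_norm(lora_delta(lora_b_trained, lora_a))
print(w_at_init, update_norm)
```

Computing `update_norm` for each adapted layer and comparing it across depth is one simple way to see which layers change most in a given stage, which is the kind of comparison Figure 2 reports.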