
eCIL-MU: Embedding based Class Incremental Learning and Machine Unlearning

Zhiwei Zuo, Zhuo Tang, Bin Wang, Kenli Li, Anwitaman Datta

TL;DR

The paper addresses continual learning in dynamic environments where a model must also be able to forget outdated categories. It proposes eCIL-MU, an embedding-based framework that stores latent representations in vector databases and migrates vectors between a CIL store (DB-CIL) and an MU store (DB-MU) using cosine similarity. At inference, a vector filter routes inputs that may belong to unlearned classes, and one of four strategies produces the output for such inputs; the shift-to-the-nearest-class strategy is the most robust. Because learning and unlearning share one embedding space, both are accelerated: on CIFAR-10/100 the framework achieves effective unlearning with speedups of up to $\sim 278\times$ over retraining baselines, which has practical implications for privacy-preserving updates in evolving domains. Cosine-based similarity combined with the dual-vector-database architecture enables non-destructive updates and scalable management of classes that are absorbed or discarded.
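
The dual-store idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `DualVectorStore`, its method names, and the in-memory dictionaries standing in for vector databases are all hypothetical, and the sketch assumes that unlearning a class simply moves its stored embeddings from DB-CIL to DB-MU by label, with cosine similarity used only for lookups.

```python
import math
from collections import defaultdict


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_u * norm_v)


class DualVectorStore:
    """Toy stand-in for the DB-CIL / DB-MU pair (hypothetical API)."""

    def __init__(self):
        self.db_cil = defaultdict(list)  # class label -> retained embeddings
        self.db_mu = defaultdict(list)   # class label -> unlearned embeddings

    def learn(self, label, embeddings):
        """CIL step: append the embeddings of a new class to DB-CIL."""
        self.db_cil[label].extend(embeddings)

    def unlearn(self, label):
        """MU step: move a class from DB-CIL to DB-MU instead of deleting it."""
        if label in self.db_cil:
            self.db_mu[label].extend(self.db_cil.pop(label))

    def best_match(self, store, query):
        """Return (label, similarity) of the most similar vector in a store."""
        best_label, best_sim = None, -1.0
        for label, vectors in store.items():
            for vec in vectors:
                sim = cosine_similarity(query, vec)
                if sim > best_sim:
                    best_label, best_sim = label, sim
        return best_label, best_sim


store = DualVectorStore()
store.learn("cat", [[1.0, 0.0], [0.9, 0.1]])
store.learn("truck", [[0.0, 1.0]])
store.unlearn("truck")  # "truck" now lives only in DB-MU
```

In this sketch nothing is retrained or erased when a class is unlearned, which is one way to read the "non-destructive" property: the vectors change stores, and the retained classes in DB-CIL are untouched.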

Abstract

New categories may be introduced over time, or existing categories may need to be reclassified. Class incremental learning (CIL) is employed for the gradual acquisition of knowledge about new categories while preserving information about previously learned ones in such dynamic environments. To adapt to reclassification, it may also be necessary to eliminate the influence of related categories on the model. We thus introduce class-level machine unlearning (MU) within CIL. Typically, MU methods tend to be time-consuming and can potentially harm the model's performance. A continuous stream of unlearning requests could lead to catastrophic forgetting. To address these issues, we propose a non-destructive eCIL-MU framework that uses embedding techniques to map data into vectors, which are then stored in vector databases. Our approach exploits the overlap between CIL and MU tasks for acceleration. Experiments demonstrate effective unlearning and acceleration of orders of magnitude (up to $\sim 278\times$).

Paper Structure (6 sections, 9 equations, 5 figures, 1 table)

This paper contains 6 sections, 9 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Training phase and inference phase of eCIL-MU
  • Figure 2: Retraining from scratch and restoring and resuming training are serial processes. In contrast, eCIL-MU enables partial parallelization, allowing overlap once $M_e$ embeds $C_f$ or the preceding CIL task ends.
  • Figure 3: t-SNE visualizations of the CIFAR-10 embedding vectors produced by $M_e$. Vectors in green belong to the unlearning class and will be transferred from DB-CIL to DB-MU.
  • Figure 4: (a) shows the samples predicted as $C_f$ after filtering (via the threshold equation). Different symbols represent the ground truth of these samples: circles indicate classes in $C_f$, while the others represent classes in $C_r$. (b) shows potential predictions for the samples in (a) under various random methods. (c) illustrates predictions under the shift-to-the-nearest-class strategy.
  • Figure 5: (a) The time taken by each method, in log scale ($10^y$ s), in various task scenarios, as well as the proportions of the CIL and MU tasks within the mixed CIL-MU task. (b) The acceleration rates of restoring and resuming training, and of eCIL-MU, compared with retraining.
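
The filtering and shift-to-the-nearest-class steps described in the Figure 4 caption can be sketched as follows. This is only an illustration under stated assumptions: the paper's threshold equation is not reproduced on this page, so the filter rule used here (flag an input when its best cosine similarity to DB-MU reaches a hypothetical threshold `tau`) is a stand-in, and nearest-neighbor search over DB-CIL stands in for the ordinary classifier. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def nearest_class(query, store):
    """Label and similarity of the closest vector in a {label: [vectors]} store."""
    best_label, best_sim = None, -1.0
    for label, vectors in store.items():
        for vec in vectors:
            sim = cosine(query, vec)
            if sim > best_sim:
                best_label, best_sim = label, sim
    return best_label, best_sim


def predict(query, db_cil, db_mu, tau=0.9):
    """Return (label, flagged); tau is a hypothetical filter threshold."""
    # Vector filter (assumed rule): is the query close to any unlearned vector?
    _, mu_sim = nearest_class(query, db_mu)
    flagged = mu_sim >= tau
    # Shift-to-the-nearest-class: a flagged input is answered with the closest
    # retained class in DB-CIL, so an unlearned label is never emitted.
    # Unflagged inputs use the same lookup here as a stand-in classifier.
    label, _ = nearest_class(query, db_cil)
    return label, flagged


# Toy stores: "truck" has been unlearned, "cat" and "dog" are retained.
db_cil = {"cat": [[1.0, 0.0, 0.0]], "dog": [[0.0, 1.0, 0.0]]}
db_mu = {"truck": [[0.0, 0.6, 0.8]]}

truck_like = predict([0.0, 0.7, 0.7], db_cil, db_mu)
cat_like = predict([1.0, 0.1, 0.0], db_cil, db_mu)
```

With these toy vectors, the truck-like query is flagged and shifted to "dog", its nearest retained class, while the cat-like query passes the filter and is labeled "cat". The random strategies in panel (b) would replace the nearest-class lookup for flagged inputs with a random choice; they are not shown here.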