Contrastive Unlearning: A Contrastive Approach to Machine Unlearning

Hong kyu Lee, Qiuchen Zhang, Carl Yang, Jian Lou, Li Xiong

TL;DR

This work proposes a contrastive unlearning framework for machine unlearning that operates in the latent representation space. By contrasting unlearning samples against the remaining data, it reshapes their embeddings so that they detach from their original classes while the representations of the remaining data are preserved. A specialized unlearning loss is combined with a cross-entropy term to maintain utility. Empirical results on CIFAR-10 and SVHN show strong unlearning effectiveness with minimal performance loss and better efficiency than state-of-the-art methods; the unlearning effect is further validated with membership inference attacks. The method offers a scalable, representation-space-based solution for privacy-preserving unlearning that applies to both class-level and sample-level forgetting.

Abstract

Machine unlearning aims to eliminate the influence of a subset of training samples (i.e., unlearning samples) from a trained model. Effectively and efficiently removing the unlearning samples without negatively impacting the overall model performance is still challenging. In this paper, we propose a contrastive unlearning framework, leveraging the concept of representation learning for more effective unlearning. It removes the influence of unlearning samples by contrasting their embeddings against the remaining samples so that they are pushed away from their original classes and pulled toward other classes. By directly optimizing the representation space, it effectively removes the influence of unlearning samples while maintaining the representations learned from the remaining samples. Experiments on a variety of datasets and models on both class unlearning and sample unlearning showed that contrastive unlearning achieves the best unlearning effects and efficiency with the lowest performance loss compared with the state-of-the-art algorithms.
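
The push-and-pull idea described in the abstract can be illustrated with a small sketch. This is not the paper's exact loss: it assumes an InfoNCE-style form in which remaining samples of other classes act as attractors and remaining samples of the unlearning sample's original class act as repellers. The function names, the temperature `tau`, and the use of cosine similarity are all assumptions made for illustration. The cross-entropy helper stands in for the utility-preserving term mentioned in the TL;DR.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_unlearning_loss(z_u, y_u, remain_embs, remain_labels, tau=0.5):
    """Hypothetical contrastive unlearning loss for ONE unlearning sample.

    z_u, y_u       : embedding and original label of the unlearning sample
    remain_embs    : embeddings of a batch of remaining samples
    remain_labels  : labels of those remaining samples
    tau            : temperature (assumed hyperparameter)

    Minimizing the returned value raises similarity to remaining samples of
    other classes and lowers similarity to remaining samples of class y_u.
    """
    z_u = normalize(z_u)
    remain = [normalize(z) for z in remain_embs]

    # Scaled similarities, split by whether the remaining sample shares y_u.
    same = [dot(z_u, z) / tau for z, y in zip(remain, remain_labels) if y == y_u]
    other = [dot(z_u, z) / tau for z, y in zip(remain, remain_labels) if y != y_u]

    # Without both groups in the batch there is nothing to contrast.
    if not same or not other:
        return 0.0

    # Denominator: same-class similarities (pushed down by minimization).
    log_denom = math.log(sum(math.exp(s) for s in same))
    # Numerator: other-class similarities (pulled up by minimization).
    return -sum(s - log_denom for s in other) / len(other)


def cross_entropy(logits, label):
    """Standard cross-entropy on one remaining sample (utility term)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]
```

In a training loop, one would presumably minimize a weighted sum of the two terms, for example `lam * contrastive_unlearning_loss(...) + cross_entropy(...)`, where `lam` is a hypothetical weighting coefficient and the cross-entropy is evaluated on remaining samples only. In this sketch, an unlearning sample whose embedding sits inside its original class cluster yields a higher loss than one that has already moved toward another class, which is the direction of change the abstract describes.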

Paper Structure (11 sections, 7 equations, 2 figures, 7 tables, 1 algorithm)

This paper contains 11 sections, 7 equations, 2 figures, 7 tables, 1 algorithm.

Figures (2)

  • Figure 1: Visualization of representation spaces of contrastive unlearning, gradient ascent, and finetune.
  • Figure 2: Accuracy on unlearning class vs. number of batches on $\mathcal{D}^u_{tr}$.