Machine Unlearning in Contrastive Learning

Zixin Wang; Kongyang Chen

TL;DR

The paper addresses privacy-driven data deletion by proposing a gradient-penalty-based approximate unlearning method that applies to both contrastive (self-supervised) and supervised models. It builds a two-stage loss framework: it starts from MEMtrain plus a gradient-penalty term, then simplifies this to MEMtrain+MEMGP, which requires gradients of the member data only and enables forgetting with limited accuracy loss (~10%). The approach defends against membership inference attacks and is demonstrated on contrastive architectures (MoCo, SimCLR, BYOL) and on supervised ResNet models, with encoder-focused analyses and visualizations used to validate unlearning. The method is simple to implement, framework-agnostic, and needs only a handful of training epochs, offering a practical route to regulation-compliant data forgetting in modern AI systems.
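To make the membership inference evaluation mentioned above concrete, here is a minimal sketch of a generic loss-threshold attack. It is not the attack or evaluation protocol used in the paper, which this summary does not specify. The function name `attack_accuracy`, the threshold value, and all loss values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def attack_accuracy(member_losses: Sequence[float],
                    nonmember_losses: Sequence[float],
                    threshold: float) -> float:
    """Balanced accuracy of a loss-threshold membership inference attack.

    A sample is guessed to be a training member when its loss is below
    `threshold`, since models usually fit training data better than unseen data.
    """
    true_pos = sum(loss < threshold for loss in member_losses)
    true_neg = sum(loss >= threshold for loss in nonmember_losses)
    tpr = true_pos / len(member_losses)
    tnr = true_neg / len(nonmember_losses)
    return 0.5 * (tpr + tnr)


# Hypothetical per-sample losses, made up for illustration only.
before_members = [0.05, 0.10, 0.08, 0.12]  # forget-set samples before unlearning
after_members = [0.90, 1.10, 0.10, 1.30]   # the same samples after unlearning
non_members = [0.95, 1.20, 0.15, 1.05]     # samples never used for training

print(attack_accuracy(before_members, non_members, threshold=0.5))
print(attack_accuracy(after_members, non_members, threshold=0.5))
```

With these made-up numbers the attack separates members from non-members well before unlearning and drops to chance level (0.5) afterwards, which is the qualitative behavior a successful unlearning method aims for.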

Abstract

Machine unlearning is a complex process that requires a model to reduce the influence of specific training data while keeping the loss of accuracy to a minimum. Despite numerous studies on machine unlearning in recent years, most have focused on supervised learning models, leaving contrastive learning models relatively underexplored. Believing that self-supervised learning has the potential to rival or surpass supervised learning, we investigate machine unlearning methods centered on contrastive learning models. In this study, we introduce a novel gradient-constraint-based approach that trains the model to achieve machine unlearning effectively. Our method requires only a small number of training epochs and the identification of the data to be unlearned. Our approach performs well not only on contrastive learning models but also on supervised learning models, demonstrating its versatility across learning paradigms.
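As a rough illustration of what a gradient constraint looks like in code, the sketch below adds a squared-gradient-norm penalty to an ordinary loss on a tiny logistic model. This is a generic gradient penalty under stated assumptions, not the authors' MEMtrain or MEMGP terms, whose exact form is not given in this summary. The model, the helper names (`loss_and_grad`, `penalized_objective`), and the weight `lam` are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (features, label in {0, 1})


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(w: Sequence[float],
                  batch: Sequence[Sample]) -> Tuple[float, List[float]]:
    """Mean logistic loss on `batch` and its gradient with respect to `w`."""
    loss = 0.0
    grad = [0.0] * len(w)
    for x, y in batch:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        loss += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi
    n = len(batch)
    return loss / n, [g / n for g in grad]


def penalized_objective(w: Sequence[float],
                        batch: Sequence[Sample],
                        lam: float) -> float:
    """Base loss plus lam * ||gradient||^2 (assumed generic penalty form)."""
    loss, grad = loss_and_grad(w, batch)
    penalty = sum(g * g for g in grad)
    return loss + lam * penalty


# Hypothetical two-sample batch and zero-initialized weights.
batch = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
weights = [0.0, 0.0]
print(penalized_objective(weights, batch, lam=2.0))
```

In a deep learning framework the same penalty would normally be computed with automatic differentiation rather than a hand-derived gradient; the closed-form gradient is used here only to keep the example dependency-free.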

Paper Structure (11 sections, 3 equations, 9 figures, 7 tables, 2 algorithms)

Introduction
Related Work
Gradient Penalty-Based Unlearning Method
Gradient Penalty
Our Design Objectives
Our Method
Model Unlearning Review
Experiments
Experimental
Experimental Results
Conclusion

Figures (9)

Figure 1: Before applying the gradient penalty
Figure 2: After applying the gradient penalty
Figure 3: Distribution of predicted probabilities for models with different degrees of overfitting
Figure 4: Change in cosine similarity between training and non-training data in the later stages before machine unlearning is performed
Figure 5: Change in the loss of a batch of training and non-training data before and after machine unlearning
...and 4 more figures
