Machine Unlearning for Image-to-Image Generative Models

Guihong Li; Hsiang Hsu; Chun-Fu Chen; Radu Marculescu

TL;DR

This work addresses the challenge of machine unlearning for image-to-image (I2I) generative models, where removing sensitive forget data without full retraining is essential. It develops a unifying, encoder-centric framework that leverages mutual information and a tractable L2 surrogate to bound KL divergences, enabling efficient unlearning across MAE, VQ-GAN, and diffusion models. The method preserves retain-set distributions while erasing forget-set information, demonstrated on ImageNet-1K and Places-365 with both full and proxy retain data, and supported by T-SNE and CLIP analyses. This approach advances data privacy and regulatory compliance for generative models, offering scalable, cross-model applicability and practical pathways for future extensions to other modalities and benchmarks.

Abstract

Machine unlearning has emerged as a new paradigm for deliberately forgetting data samples from a given model in order to adhere to stringent regulations. However, existing machine unlearning methods have focused primarily on classification models, leaving the landscape of unlearning for generative models relatively unexplored. This paper bridges that gap by providing a unifying framework of machine unlearning for image-to-image generative models. Within this framework, we propose a computationally efficient algorithm, underpinned by rigorous theoretical analysis, that demonstrates negligible performance degradation on the retain samples while effectively removing the information from the forget samples. Empirical studies on two large-scale datasets, ImageNet-1K and Places-365, further show that our algorithm does not rely on the availability of the retain samples, which further complies with data retention policies. To the best of our knowledge, this work is the first systematic, theoretical, and empirical exploration of machine unlearning specifically tailored for image-to-image generative models. Our code is available at https://github.com/jpmorganchase/l2l-generator-unlearning.

Paper Structure (44 sections, 4 theorems, 33 equations, 21 figures, 4 tables, 1 algorithm)

Introduction
Related Work
I2I generative models.
Machine unlearning.
Problem Formulation and Proposed Approach
Definition of Unlearning on I2I Generative Models
Optimization on Retain and Forget sets
Proposed Approach
Efficient Unlearning Approach.
Experimental Results
Experimental Setup
Performance Analysis and Visualization
T-SNE analysis.
Robustness to Retain Samples Availability
Ablation Study
...and 29 more sections

Key Result

Lemma 1

Given the distribution of the forget samples $\mathcal{P}_{X_f}$ with zero mean and covariance matrix $\Sigma$, consider another signal $\mathcal{P}_{\hat{X}_f}$ which shares the same mean and covariance matrix. The maximal KL-divergence between $\mathcal{P}_{X_f}$ and $\mathcal{P}_{\hat{X}_f}$

Figures (21)

Figure 1: Our machine unlearning framework is applicable to various types of I2I generative models, including diffusion models \cite{palette_diff_base}, VQ-GAN \cite{vqgan_mage}, and MAE \cite{mae_masked_he} (cf. Section \ref{sec:exp}). The images in the retain set remain almost unaffected before and after unlearning (up to a slight difference due to the perplexity of generative models). Conversely, the images in the forget set are nearly noise after unlearning, as designed.
Figure 2: Overview of our approach. On $\mathcal{D}_F$, we minimize the $L_2$-loss between the embedding vectors of the forget samples $x_f$ and the embedding vectors of Gaussian noise $n$. On $\mathcal{D}_R$, we minimize the $L_2$-loss between the embedding vectors of the same image produced by the target model's encoder and by the original model's encoder.
Figure 3: Results of cropping $8\times 8$ patches at the center of the image on diffusion models, where each patch is $16 \times 16$ pixels. Our method has negligible-to-slight performance degradation on diverse I2I generative models and multiple generative tasks (cf. Appendices \ref{app:supple_results} and \ref{app:varyingmaskingtype_ratio}).
Figure 4: T-SNE analysis of the images generated by our approach and the ground truth images. After unlearning, the generated retain samples are close to or overlap with the ground truth (orange vs. blue), while most of the generated forget images diverge far from the ground truth (green vs. red).
Figure C.5: Covariance matrices of three commonly used datasets. For CIFAR10/100, we convert the images into gray-scale images. We take the absolute value of the covariance matrix for better illustration.
...and 16 more figures
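
The two $L_2$ terms described in the Figure 2 caption can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the linear toy encoder, the helper names (`make_linear_encoder`, `unlearning_loss`), the weighting factor `alpha`, and the choice to embed the Gaussian noise with the original (frozen) encoder are all assumptions made for the example.

```python
import numpy as np


def make_linear_encoder(weights):
    """Hypothetical toy encoder: flatten each image, then project linearly."""
    def encode(images):
        return images.reshape(len(images), -1) @ weights
    return encode


def unlearning_loss(encode_target, encode_original,
                    x_retain, x_forget, noise, alpha=1.0):
    """Sum of the retain and forget L2 terms; alpha is an assumed weight."""
    # Retain term: the target encoder should reproduce the original
    # encoder's embeddings on retain images.
    retain_diff = encode_target(x_retain) - encode_original(x_retain)
    retain_term = np.mean(np.sum(retain_diff ** 2, axis=-1))
    # Forget term: the target encoder's embeddings of forget images are
    # pulled toward embeddings of Gaussian noise (embedded here with the
    # original encoder, which is an assumption of this sketch).
    forget_diff = encode_target(x_forget) - encode_original(noise)
    forget_term = np.mean(np.sum(forget_diff ** 2, axis=-1))
    return retain_term + alpha * forget_term


rng = np.random.default_rng(0)
original_weights = rng.normal(size=(64, 8))  # 8x8 toy images, 8-dim embeddings
encode_original = make_linear_encoder(original_weights)
# Before any update, the target encoder is a copy of the original one.
encode_target = make_linear_encoder(original_weights.copy())

x_retain = rng.normal(size=(4, 8, 8))
x_forget = rng.normal(size=(4, 8, 8))
noise = rng.normal(size=(4, 8, 8))

loss_at_start = unlearning_loss(encode_target, encode_original,
                                x_retain, x_forget, noise)
print(loss_at_start)
```

At initialization the retain term is exactly zero because the two encoders coincide, so the whole loss comes from the forget term. In an actual training loop only the target encoder would be updated, under whatever optimizer and model the practitioner chooses.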

Theorems & Definitions (4)

Lemma 1
Theorem 1
Proposition 1
Proposition 2
