Unlearning the Unpromptable: Prompt-free Instance Unlearning in Diffusion Models

Kyungryeol Lee; Kyeonghyun Lee; Seongmin Hong; Byung Hyun Lee; Se Young Chun

TL;DR

This work introduces a surrogate-based unlearning method that combines image editing, timestep-aware weighting, and gradient surgery to make trained diffusion models forget specific outputs, including outputs that cannot be specified by a text prompt, while preserving the model's remaining outputs.

Abstract

Machine unlearning aims to remove specific outputs from trained models, often at the concept level, such as forgetting all occurrences of a particular celebrity or filtering content via text prompts. However, many undesired outputs, such as an individual's face or culturally or factually inaccurate generations, often cannot be specified by text prompts. We address this underexplored setting of instance unlearning for outputs that are undesired but unpromptable, where the goal is to selectively forget target outputs while preserving the rest. To this end, we introduce a surrogate-based unlearning method that leverages image editing, timestep-aware weighting, and gradient surgery to guide trained diffusion models toward forgetting specific outputs. Experiments on conditional (Stable Diffusion 3) and unconditional (DDPM-CelebA) diffusion models demonstrate that our prompt-free method unlearns unpromptable outputs, such as faces and culturally inaccurate depictions, while preserving the remaining outputs, which the prompt-based and prompt-free baselines fail to do. Our proposed method could serve as a practical hotfix for diffusion model providers to ensure privacy protection and ethical compliance.
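The abstract names gradient surgery without giving the exact rule in this excerpt. As a rough illustration only, the sketch below assumes a PCGrad-style projection: when the forgetting gradient conflicts with the retention gradient (negative inner product), the conflicting component is removed. The function name `project_conflicting`, the choice of which gradient gets projected, and the toy vectors are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def project_conflicting(g_forget, g_retain, eps=1e-12):
    """PCGrad-style projection (assumed variant, not taken from the paper).

    If g_forget points against g_retain, drop the component of g_forget
    that lies along g_retain; otherwise return g_forget unchanged.
    """
    dot = float(g_forget @ g_retain)
    if dot >= 0.0:
        # No conflict: the forgetting step does not hurt retention to first order.
        return g_forget.copy()
    # Conflict: subtract the projection of g_forget onto g_retain.
    scale = dot / (float(g_retain @ g_retain) + eps)
    return g_forget - scale * g_retain

# Toy flattened gradients (hypothetical values).
g_forget = np.array([1.0, -2.0, 0.5])
g_retain = np.array([0.0, 1.0, 0.0])
g_projected = project_conflicting(g_forget, g_retain)
```

After projection the update is orthogonal to the retention gradient, so to first order it no longer increases the retention loss; how this interacts with timestep-aware weighting is not specified in this excerpt.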

Paper Structure (21 sections, 3 theorems, 47 equations, 11 figures, 6 tables, 1 algorithm)


Introduction
Problem Formulation
Diffusion Models
Prompt-based Unlearning for Diffusion Models
Prompt-free Instance Unlearning for Diffusion Models
Related Work and Challenges
Related Work
Challenges: The Fundamental Mismatch in Applying Prompt-based to Prompt-free Unlearning
Methods
Unlearning Methods
Surrogate vs. Exact Unlearning: A Theoretical View
Surrogate Dataset Construction
Experiments
Instance Unlearning on Unconditional Diffusion Models
Correcting Unpromptable Misrepresentations through Instance Unlearning
...and 6 more sections

Key Result

Theorem 1

Let $\theta^*\in \mathbb{R}^d$ be the parameter vector obtained by solving the ridge-regression problem: where $X \in \mathbb{R}^{n\times d}$, $y\in \mathbb{R}^n$, and $\lambda \ge 0$. Denote so that Now remove the $i$-th row $\bigl(x_i, y_i\bigr)$ from $X, y$, producing $\widetilde{X}, \widetilde{y}$. The new solution, trained from scratch on $\widetilde{X}, \widetilde{y}$, is Then, the diffe
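The display equations of the theorem did not survive in this excerpt, so the following is only a numerical sketch of the setting it describes, under stated assumptions. It assumes the standard ridge objective $\|X\theta - y\|^2 + \lambda\|\theta\|^2$ with closed form $\theta^* = A^{-1}X^\top y$, where $A = X^\top X + \lambda I$, and uses the Sherman-Morrison identity to obtain the leave-one-out solution without retraining. The symbols `A` and `h_i` and the function names are illustrative choices and may not match the paper's notation.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: solve (X^T X + lam * I) theta = X^T y."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

def ridge_leave_one_out(X, y, lam, i):
    """Solution after removing row i, via a rank-one (Sherman-Morrison) update.

    Assumes A is invertible (e.g. lam > 0) and the leverage h_i is below 1.
    """
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    theta = np.linalg.solve(A, X.T @ y)
    x_i = X[i]
    a_inv_x = np.linalg.solve(A, x_i)   # A^{-1} x_i
    h_i = x_i @ a_inv_x                 # leverage of sample i
    residual = y[i] - x_i @ theta       # prediction error on sample i
    return theta - a_inv_x * residual / (1.0 - h_i)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = rng.normal(size=50)
lam, i = 0.1, 7

# "Exact unlearning": retrain from scratch without sample i.
theta_retrained = ridge_fit(np.delete(X, i, axis=0), np.delete(y, i), lam)
# Same result from the closed-form update of the original solution.
theta_updated = ridge_leave_one_out(X, y, lam, i)
max_gap = float(np.max(np.abs(theta_retrained - theta_updated)))
```

The two solutions should agree up to floating-point error, which makes this a convenient sanity check for any closed-form expression of the parameter change caused by removing one sample.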

Figures (11)

Figure 1: Challenge and our solution for instance unlearning in diffusion models.
Figure 2: Cultural and semantic misrepresentations highlight the need for instance unlearning in commercial generative models.
Figure 3: Comparison between (a) prompt-based and (b) instance unlearning.
Figure 4: Surrogate-based unlearning ($\theta^\dagger$) can be better than exact unlearning ($\tilde{\theta}$) at preserving the mapping, i.e., its line is closer to the original ($\theta^*$).
Figure 5: Surrogate data construction of (left) CelebA with TediGAN, (middle) SD3 with SDEdit, and (right) manual editing.
...and 6 more figures

Theorems & Definitions (6)

Theorem 1: Exact Unlearning (restated from Golub et al., 1979)
proof
Theorem 2: Surrogate-based Unlearning
proof
Corollary 3: Comparison
proof
