How Diffusion Models Memorize

Juyeop Kim, Songkuk Kim, Jong-Seok Lee

TL;DR

Memorization in diffusion models poses privacy and copyright risks. The authors analyze denoising dynamics in latent space and introduce a decomposition framework, showing that early overestimation, driven by classifier-free guidance and the injection of memorized content, causes latent trajectories to converge toward memorized samples. They demonstrate that memorization cannot be fully explained by overfitting, since the loss in early denoising can be higher under memorization, and that conditional noise predictions carry memorized information while unconditional predictions do not. Decomposing intermediate latents reveals that early dominance of memorized content correlates with memorization severity, and the results hold across multiple Stable Diffusion (SD) variants and RealisticVision. As a practical mitigation, the work suggests delaying classifier-free guidance during early denoising to preserve latent diversity and prevent rapid replication of training data.
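
The guidance scales quoted in this summary ($g=1.0$ for no guidance, $g=7.5$ for guided sampling) are consistent with the standard classifier-free guidance combination, in which the conditional and unconditional noise predictions are mixed linearly. The following is a minimal sketch of that combination together with a delayed schedule; it is not the authors' implementation. The function names and the `delay_steps` value are hypothetical, and noise predictions are represented as plain Python lists for illustration.

```python
def guided_noise(eps_uncond, eps_cond, g):
    """Standard classifier-free guidance: eps_uncond + g * (eps_cond - eps_uncond)."""
    return [u + g * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def delayed_guidance_scales(num_steps, g=7.5, delay_steps=5):
    """Guidance scale per denoising step, first step (t = T) first.

    The first `delay_steps` steps use g = 1.0 (guidance off); later steps use `g`.
    `delay_steps=5` is an illustrative assumption, not a value from the paper.
    """
    return [1.0 if step < delay_steps else g for step in range(num_steps)]


# Toy two-dimensional noise predictions (made-up numbers).
eps_uncond = [0.25, -0.5]
eps_cond = [0.75, 0.5]

# With g = 1.0 the result equals the conditional prediction.
print(guided_noise(eps_uncond, eps_cond, 1.0))
# With g > 1 the difference (eps_cond - eps_uncond) is amplified.
print(guided_noise(eps_uncond, eps_cond, 7.5))
# Guidance is switched on only after the first three steps.
print(delayed_guidance_scales(num_steps=8, delay_steps=3))
```

Keeping the scale at 1.0 for the earliest steps leaves the conditional prediction unamplified at the stage where, according to the summary, overestimation does the most damage. How many steps to delay is a tuning choice that this sketch does not settle.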

Abstract

Despite their success in image generation, diffusion models can memorize training data, raising serious privacy and copyright concerns. Although prior work has sought to characterize, detect, and mitigate memorization, the fundamental question of why and how it occurs remains unresolved. In this paper, we revisit the diffusion and denoising process and analyze latent space dynamics to address the question: "How do diffusion models memorize?" We show that memorization is driven by the overestimation of training samples during early denoising, which reduces diversity, collapses denoising trajectories, and accelerates convergence toward the memorized image. Specifically: (i) memorization cannot be explained by overfitting alone, as training loss is larger under memorization due to classifier-free guidance amplifying predictions and inducing overestimation; (ii) memorized prompts inject training images into noise predictions, forcing latent trajectories to converge and steering denoising toward their paired samples; and (iii) a decomposition of intermediate latents reveals how initial randomness is quickly suppressed and replaced by memorized content, with deviations from the theoretical denoising schedule correlating almost perfectly with memorization severity. Together, these results identify early overestimation as the central underlying mechanism of memorization in diffusion models.

Paper Structure

This paper contains 22 sections, 27 equations, 24 figures, 1 table.

Figures (24)

  • Figure 1: Guidance amplifies the presence of $\mathbf{x}$. Squared $\ell_{2}$ distance (x-axis; log scale) and cosine similarity (y-axis) between $\hat{\mathbf{x}}_{0}^{(t)}$ and $\mathbf{x}$ after different numbers of denoising steps (columns). The top row corresponds to $g=1.0$, and the bottom row to $g=7.5$. Point color denotes SSCD score.
  • Figure 2: Lack of guidance degrades quality. Generated images (a) without classifier-free guidance ($g=1.0$) and (b) with classifier-free guidance ($g=7.5$).
  • Figure 3: Guidance drives memorization. SSCD scores with (y-axis) and without (x-axis) classifier-free guidance.
  • Figure 4: Memorization emerges from the very first step. (a) Training images $\mathbf{x}$ and (b) their first-step predictions $\hat{\mathbf{x}}_{0}^{(T)}$ from paired memorized prompts $c$ (SSCD score $\geq 0.75$) under $g=7.5$.
  • Figure 5: Conditional noise prediction captures memorized data. Cosine similarity between noise predictions and latents at $t=T$, for normal (blue; SSCD $<0.75$) and memorized (red; SSCD $\geq0.75$) prompts under $g=7.5$.
  • ...and 19 more figures
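
The Figure 1 caption compares a prediction $\hat{\mathbf{x}}_{0}^{(t)}$ with the training latent $\mathbf{x}$ using squared $\ell_{2}$ distance and cosine similarity. As a rough illustration, the sketch below assumes that $\hat{\mathbf{x}}_{0}^{(t)}$ is the usual clean-sample estimate obtained by inverting the DDPM forward relation $\mathbf{x}_{t} = \sqrt{\bar{\alpha}_{t}}\,\mathbf{x}_{0} + \sqrt{1-\bar{\alpha}_{t}}\,\boldsymbol{\epsilon}$; the excerpt does not give the paper's exact definition. The name `alpha_bar_t` and all helper functions are hypothetical, and latents are flattened to plain lists.

```python
import math


def predict_x0(x_t, eps_hat, alpha_bar_t):
    """Solve x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps for x0, given a noise estimate."""
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps_hat)]


def squared_l2(u, v):
    """Squared Euclidean distance between two flattened latents."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def cosine_similarity(u, v):
    """Cosine of the angle between two flattened latents (both must be nonzero)."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)


# Toy example with made-up numbers: noise a latent, then recover it.
x = [1.0, -2.0, 0.5]       # stand-in for a training latent
eps = [0.3, 0.1, -0.7]     # stand-in for the true noise
alpha_bar_t = 0.25         # assumed cumulative schedule coefficient
x_t = [math.sqrt(alpha_bar_t) * p + math.sqrt(1.0 - alpha_bar_t) * e
       for p, e in zip(x, eps)]

# With a perfect noise estimate, the prediction recovers x up to rounding.
x0_hat = predict_x0(x_t, eps, alpha_bar_t)
print(squared_l2(x0_hat, x), cosine_similarity(x0_hat, x))
```

In an actual experiment `eps_hat` would come from the guided noise prediction at step $t$, so the two metrics show how closely the model's current estimate already matches the training latent.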