Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models

Jinxu Lin, Linwei Tao, Minjing Dong, Chang Xu

TL;DR

This paper tackles the problem of tracing training data influence in diffusion models, motivated by copyright and privacy concerns in image generation. It introduces Diffusion Attribution Score (DAS), a theoretically grounded metric that directly measures the KL-divergence between the predicted distributions with and without a given training sample, using changes in the diffusion noise predictor. DAS is derived via linearization and Newton-based leave-one-out updates, and it is paired with practical acceleration techniques (gradient projection, model compression, timesteps/subset screening) to scale to large models; extensive analysis shows that $D_{KL}(p_\theta(x^{gen}) \| p_{\theta\setminus i}(x^{gen}))$ can be estimated accurately and efficiently. Across CIFAR-2, ArtBench, and CelebA, DAS achieves state-of-the-art performance on the Linear Data-Modelling Score (LDS), is validated through counterfactual visualizations, and demonstrates robust attribution across datasets and inference settings, underscoring its practical impact for transparent and fair use of diffusion models.
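As a rough illustration of why a KL-divergence between two predicted distributions can be written in terms of the noise predictor, consider the standard DDPM reverse step. This parameterization is an assumption here, since the summary does not spell it out. The schedule symbols $\alpha_t$, $\beta_t$, $\bar\alpha_t$, $\sigma_t$ are the usual DDPM quantities and do not appear in the text above. The paper's own derivation may differ in its details.

```latex
% Assumed DDPM reverse step with fixed variance \sigma_t^2 I
p_\theta(x_{t-1}\mid x_t) = \mathcal{N}\!\left(\mu_\theta(x_t,t),\ \sigma_t^2 I\right),
\qquad
\mu_\theta(x_t,t) = \frac{1}{\sqrt{\alpha_t}}
  \left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t,t)\right)

% KL between two Gaussians that share the covariance \sigma_t^2 I
D_{KL}\!\left(p_\theta \,\|\, p_{\theta\setminus i}\right)
  = \frac{\left\|\mu_\theta - \mu_{\theta\setminus i}\right\|^2}{2\sigma_t^2}
  = \frac{\beta_t^2}{2\sigma_t^2\,\alpha_t\,(1-\bar\alpha_t)}
    \left\|\epsilon_\theta(x_t,t) - \epsilon_{\theta\setminus i}(x_t,t)\right\|^2
```

Under these assumptions, the divergence at a given timestep is proportional to the squared change in the noise predictor's output, which is consistent with the description of DAS as "using changes in the diffusion noise predictor".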

Abstract

As diffusion models become increasingly popular, the misuse of copyrighted and private images has emerged as a major concern. One promising way to mitigate this issue is to identify the contribution of specific training samples to a generative model's outputs, a process known as data attribution. Existing data attribution methods for diffusion models typically quantify the contribution of a training sample by evaluating the change in diffusion loss when the sample is included in or excluded from the training process. However, we argue that using the diffusion loss directly cannot represent this contribution accurately because of how the loss is computed. Specifically, these approaches measure the divergence between the predicted and ground-truth distributions, which yields only an indirect comparison between the predicted distributions and cannot capture the differences between model behaviors. To address these issues, we directly compare the predicted distributions through an attribution score that measures training sample importance, the Diffusion Attribution Score (DAS). We support the effectiveness of DAS with rigorous theoretical analysis. Additionally, we explore strategies to accelerate DAS computation, facilitating its application to large-scale diffusion models. Our extensive experiments across various datasets and diffusion models demonstrate that DAS significantly surpasses prior methods in terms of the linear data-modelling score, establishing new state-of-the-art performance. Code is available at https://github.com/Jinxu-Lin/DAS.
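The argument about indirect comparison can be made concrete with a toy example. The sketch below uses made-up two-dimensional noise predictions (hypothetical numbers, not taken from the paper) to show that two models can have the same diffusion loss against the ground-truth noise while still predicting very different outputs.

```python
# Toy illustration with hypothetical numbers; not the paper's experiment.

def squared_error(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Ground-truth noise for one (x_t, t) pair.
true_noise = [0.0, 0.0]

# Noise predicted by a model trained on the full set, and by a
# hypothetical model retrained without training sample i.
pred_full = [1.0, -1.0]
pred_without_i = [-1.0, 1.0]

# Loss-based view: each model is compared with the ground truth.
loss_full = squared_error(pred_full, true_noise)
loss_without_i = squared_error(pred_without_i, true_noise)
loss_change = loss_without_i - loss_full

# Direct view: the two models' predictions are compared with each other.
output_change = squared_error(pred_full, pred_without_i)

print(f"change in loss:   {loss_change}")
print(f"change in output: {output_change}")
```

Here the change in loss is zero even though the two predictions point in opposite directions. A loss-difference score would call sample i uninfluential in this case, whereas a direct comparison of the predictions would not.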

Paper Structure

This paper contains 36 sections, 42 equations, 11 figures, 7 tables, 1 algorithm.

Figures (11)

  • Figure 1: We conduct a visualization experiment to explore the effectiveness of DAS, as described in the counterfactual visualization evaluation section. Removing the influential samples identified by DAS produces the most significant differences in the generated images after retraining the model. DAS is the most effective method for attribution.
  • Figure 2: LDS (%) on CIFAR-2 under different projection dimensions $k$. We consider 10 and 100 timesteps selected to be evenly spaced within the interval $[1, T]$, which are used to approximate the expectation $\mathbb{E}_t$. For each sampled timestep, we sample one standard Gaussian noise $\epsilon\sim\mathcal{N}(\epsilon|0, I)$ to approximate the expectation $\mathbb{E}_\epsilon$.
  • Figure 3: LDS (%) on CIFAR-2 across different checkpoints. We consider 10 and 100 timesteps, evenly spaced within the interval $[1,T]$, to approximate the expectation $\mathbb{E}_t$. At each sampled timestep, we sample one standard Gaussian noise $\boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$ to approximate the expectation $\mathbb{E}_\epsilon$. We set the projection dimension $k=32768$.
  • Figure 4: LDS (%) on CIFAR-2 under different $\lambda$. We consider 10, 100, and 1000 timesteps selected to be evenly spaced within the interval $[1, T]$, which are used to approximate the expectation $\mathbb{E}_t$. We set $k = 4096$.
  • Figure 5: LDS (%) on ArtBench-2 under different $\lambda$. We consider 10 and 100 timesteps selected to be evenly spaced within the interval $[1, T]$, which are used to approximate the expectation $\mathbb{E}_t$. We set $k = 32768$.
  • ...and 6 more figures
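
The captions above describe approximating $\mathbb{E}_t$ with a small set of evenly spaced timesteps and $\mathbb{E}_\epsilon$ with one Gaussian noise draw per timestep. A minimal sketch of that sampling scheme follows. The function names, the value `T = 1000`, the choice to include both endpoints, and the placeholder `per_sample_fn` are all assumptions made for illustration and are not taken from the paper or its code.

```python
import random

def evenly_spaced_timesteps(num_steps, T):
    """Return num_steps integer timesteps evenly spaced over [1, T].

    Including both endpoints is an assumption of this sketch.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    return [round(1 + i * (T - 1) / (num_steps - 1)) for i in range(num_steps)]

def estimate_expectation(per_sample_fn, timesteps, dim, seed=0):
    """Monte Carlo estimate of E_t E_eps [ per_sample_fn(t, eps) ].

    One standard Gaussian noise vector of length dim is drawn per timestep.
    per_sample_fn is a hypothetical stand-in for whatever per-(t, eps)
    quantity is being averaged, for example a loss or a model output.
    """
    rng = random.Random(seed)
    total = 0.0
    for t in timesteps:
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        total += per_sample_fn(t, eps)
    return total / len(timesteps)

T = 1000  # assumed number of diffusion steps
timesteps = evenly_spaced_timesteps(10, T)
print(timesteps)

# Sanity check with a function that ignores the noise and returns t.
mean_t = estimate_expectation(lambda t, eps: float(t), timesteps, dim=4)
print(mean_t)
```

Using more timesteps (100 or 1000 instead of 10) reduces the variance of the estimate at a proportional increase in compute, which is the trade-off the figures vary.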