Intriguing Properties of Data Attribution on Diffusion Models

Xiaosen Zheng, Tianyu Pang, Chao Du, Jing Jiang, Min Lin

TL;DR

This work presents a significantly more efficient approach for attributing diffusion models, while the unexpected findings suggest that at least in non-convex settings, constructions guided by theoretical assumptions may lead to inferior attribution performance.

Abstract

Data attribution seeks to trace model outputs back to training data. With the recent development of diffusion models, data attribution has become a desired module to properly assign valuations for high-quality or copyrighted training samples, ensuring that data contributors are fairly compensated or credited. Several theoretically motivated methods have been proposed to implement data attribution, in an effort to improve the trade-off between computational scalability and effectiveness. In this work, we conduct extensive experiments and ablation studies on attributing diffusion models, specifically focusing on DDPMs trained on CIFAR-10 and CelebA, as well as a Stable Diffusion model LoRA-finetuned on ArtBench. Intriguingly, we report counter-intuitive observations that theoretically unjustified design choices for attribution empirically outperform previous baselines by a large margin, in terms of both linear datamodeling score and counterfactual evaluation. Our work presents a significantly more efficient approach for attributing diffusion models, while the unexpected findings suggest that at least in non-convex settings, constructions guided by theoretical assumptions may lead to inferior attribution performance. The code is available at https://github.com/sail-sg/D-TRAK.

Paper Structure (29 sections, 18 equations, 24 figures, 8 tables)

Figures (24)

  • Figure 1: LDS (%) on CIFAR-2, where $\phi^{s}$ is constructed by the interpolation described in Section \ref{sec32} for $\eta\in[0,1]$. The experimental setup is identical to that of Table \ref{tab:CIFAR2-f}. The three subplots correspond to $10$, $100$, and $1000$ timesteps, respectively, evenly spaced within the interval $[1,T]$ and used to approximate the expectation $\mathbb{E}_{t}$ over $t\sim\mathcal{U}([1,T])$.
  • Figure 2: LDS (%) on the generation set of (Top) CIFAR-2 and (Bottom) ArtBench-2 using checkpoints from different epochs. We select $10$, $100$, and $1000$ timesteps evenly spaced within the interval $[1,T]$ to approximate $\mathbb{E}_{t}$. For each selected timestep, we sample one standard Gaussian noise to approximate $\mathbb{E}_{\boldsymbol{\epsilon}}$. We set $k=32768$. For the full results, see Figures \ref{tab:CIFAR2-ckpt}, \ref{tab:CelebA-ckpt} and \ref{tab:ArtBench-ckpt} in Appendix \ref{appendix:ablation}.
  • Figure 3: Boxplots of counterfactual evaluation on CIFAR-2 and ArtBench-2. We quantify the impact of removing the 1,000 highest-scoring training samples and re-training according to Random, TRAK, and D-TRAK. We measure the pixel-wise $\ell_2$-distance and CLIP cosine similarity between 60 synthesized samples and corresponding images generated by the re-trained models when sampling from the same random seed. For the results on CelebA, see Figure \ref{tab:counter-eval-celeba} in Appendix \ref{appendix:additional_exp:counter}.
  • Figure 4: Counterfactual visualization on (Top) CIFAR-2 and (Bottom) ArtBench-2. We compare the samples to those generated by retrained models using the same seed. See Appendix \ref{appendix:counter_viz} for more cases.
  • Figure 5: LDS (%) on CIFAR-2 under different $k$. The number of timesteps is given in parentheses. We consider $10$ and $100$ timesteps evenly spaced within the interval $[1,T]$, which are used to approximate the expectation $\mathbb{E}_{t}$. For each selected timestep, we sample one standard Gaussian noise $\boldsymbol{\epsilon}\sim\mathcal{N}(\boldsymbol{\epsilon}|\mathbf{0},\mathbf{I})$ to approximate the expectation $\mathbb{E}_{\boldsymbol{\epsilon}}$.
  • ...and 19 more figures
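
Several captions above approximate $\mathbb{E}_{t}$ with a small set of timesteps evenly spaced in $[1,T]$ and $\mathbb{E}_{\boldsymbol{\epsilon}}$ with one Gaussian draw per timestep. The following is a minimal sketch of that sampling scheme, not the authors' code: the names `evenly_spaced_timesteps` and `mc_estimate`, and the per-sample function `f(t, eps)`, are hypothetical placeholders, and rounding to the nearest integer is an assumption about how "evenly spaced" is realised.

```python
import numpy as np


def evenly_spaced_timesteps(T: int, n: int) -> np.ndarray:
    """Return up to n integer timesteps evenly spaced within [1, T]."""
    # Rounding to integers is an assumption; duplicates (if n > T) are dropped.
    return np.unique(np.linspace(1, T, num=n).round().astype(int))


def mc_estimate(f, T: int, n: int, shape, seed: int = 0):
    """Approximate E_t E_eps [f(t, eps)] with n timesteps and one noise draw each.

    f is a hypothetical stand-in for whatever per-(t, eps) quantity is averaged.
    """
    rng = np.random.default_rng(seed)
    values = [
        f(t, rng.standard_normal(shape))  # one standard Gaussian noise per timestep
        for t in evenly_spaced_timesteps(T, n)
    ]
    return np.mean(values, axis=0)


# Usage: with T = 1000 and 10 timesteps the grid runs from 1 to 1000 in steps of 111.
timesteps = evenly_spaced_timesteps(1000, 10)
```

Using more timesteps (100 or 1000 instead of 10) tightens the approximation of $\mathbb{E}_{t}$ at a proportional cost in evaluations of `f`.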

Theorems & Definitions (2)

  • Definition 1: Data attribution
  • Definition 2: Linear datamodeling score
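
Only the title of Definition 2 appears here, so the sketch below assumes the linear datamodeling score as it is commonly defined in the TRAK line of work: for one sample of interest, the Spearman rank correlation between (a) the outputs of models retrained on random training subsets and (b) the sum of attribution scores over each subset. The function names and array layout are hypothetical, and the rank computation ignores ties for brevity.

```python
import numpy as np


def _ranks(x: np.ndarray) -> np.ndarray:
    """Ranks 0..len(x)-1 of x (ties are not averaged; assumes distinct values)."""
    order = np.argsort(x)
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(len(x))
    return ranks


def spearman(a, b) -> float:
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(_ranks(np.asarray(a)), _ranks(np.asarray(b)))[0, 1])


def lds(scores: np.ndarray, subset_masks: np.ndarray, subset_outputs: np.ndarray) -> float:
    """Linear datamodeling score for a single sample of interest.

    scores:         (N,) attribution score of each training point.
    subset_masks:   (M, N) boolean; row m marks the training points in subset m.
    subset_outputs: (M,) model output on the sample after retraining on subset m.
    """
    # Additive prediction: sum the scores of the points contained in each subset.
    predicted = subset_masks.astype(float) @ scores
    return spearman(predicted, subset_outputs)


# Usage with synthetic data: outputs that are a monotone function of the
# additive prediction give the maximum score.
rng = np.random.default_rng(0)
scores = rng.standard_normal(20)
masks = rng.random((8, 20)) < 0.5
outputs = 3.0 * (masks.astype(float) @ scores) + 1.0
score = lds(scores, masks, outputs)
```

In practice the score is averaged over many samples of interest, and the retrained-model outputs are the expensive part, since each subset requires its own training run.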