Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences

Damien Ferbach, Quentin Bertrand, Avishek Joey Bose, Gauthier Gidel

TL;DR

The paper studies how human curation of synthetic content shapes the self-consuming retraining loop of generative models. It models curation via a discrete choice framework, showing that in the limit of many comparisons (K → ∞) the update mirrors RLHF, with p_{t+1}(x) ≈ p_t(x) e^{r(x)}/E_{p_t}[e^{r}], and that under suitable assumptions the expected reward converges toward the highest reward level r_*. When a positive fraction of real data is mixed in at each retraining step (finite λ), the authors prove stability and show KL convergence to the data-optimal distribution in favorable neighborhoods, while the reward increases monotonically and remains bounded relative to a reference distribution. Across synthetic experiments and CIFAR-10, the results illustrate that curated data can amplify certain preferences or biases, particularly under classifier-based rewards, underscoring practical and societal implications for web-scale data curation and model alignment.
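To make the limiting update concrete, here is a minimal sketch on a finite support, assuming the model family is rich enough to fit the curated distribution exactly. The toy rewards, the uniform starting distribution, and the helper names (`curated_update`, `mixed_update`, `expected_reward`) are hypothetical. The mixing weights 1/(1+λ) and λ/(1+λ) are an assumption about how λ enters; the summary above does not spell them out.

```python
import math

def curated_update(p, r):
    """One retraining step in the K -> infinity limit on a finite support:
    p_next(x) = p(x) * exp(r(x)) / E_p[exp(r)]."""
    weights = [px * math.exp(rx) for px, rx in zip(p, r)]
    z = sum(weights)  # E_p[exp(r)], the normalizing constant
    return [w / z for w in weights]

def mixed_update(p, r, p_data, lam):
    """Assumed mixture step: weight 1/(1+lam) on real data and
    lam/(1+lam) on the curated synthetic distribution."""
    curated = curated_update(p, r)
    return [(pd + lam * pc) / (1.0 + lam) for pd, pc in zip(p_data, curated)]

def expected_reward(p, r):
    """E_p[r] for a distribution p on a finite support."""
    return sum(px * rx for px, rx in zip(p, r))

# Hypothetical toy problem: four outcomes, uniform start, maximum reward r_* = 2.0.
rewards = [0.0, 0.5, 1.0, 2.0]
p_data = [0.25, 0.25, 0.25, 0.25]

# Fully synthetic curated loop: mass concentrates on the highest-reward outcome.
p = list(p_data)
history = [expected_reward(p, rewards)]
for _ in range(30):
    p = curated_update(p, rewards)
    history.append(expected_reward(p, rewards))

# Loop with real data mixed in at every step: every outcome keeps at least
# p_data(x) / (1 + lam) of the mass, so the loop cannot collapse onto one outcome.
q = list(p_data)
for _ in range(30):
    q = mixed_update(q, rewards, p_data, lam=1.0)
```

In this toy setting `history` is nondecreasing and approaches r_* = 2.0, while `q` settles at a distribution whose expected reward lies strictly between that of `p_data` and r_*. This is only an illustration of the dynamics described above, not the paper's experimental setup.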

Abstract

The rapid progress in generative models has resulted in impressive leaps in generation quality, blurring the lines between synthetic and real data. Web-scale datasets are now prone to the inevitable contamination by synthetic data, directly impacting the training of future generative models. Already, some theoretical results on self-consuming generative models (a.k.a., iterative retraining) have emerged in the literature, showcasing that either model collapse or stability could be possible depending on the fraction of generated data used at each retraining step. However, in practice, synthetic data is often subject to human feedback and curated by users before being used and uploaded online. For instance, many interfaces of popular text-to-image generative models, such as Stable Diffusion or Midjourney, produce several variations of an image for a given query which can eventually be curated by the users. In this paper, we theoretically study the impact of data curation on iterated retraining of generative models and show that it can be seen as an implicit preference optimization mechanism. However, unlike standard preference optimization, the generative model does not have access to the reward function or negative samples needed for pairwise comparisons. Moreover, our study does not require access to the density function, only to samples. We prove that, if the data is curated according to a reward model, then the expected reward of the iterative retraining procedure is maximized. We further provide theoretical results on the stability of the retraining loop when using a positive fraction of real data at each step. Finally, we conduct illustrative experiments on both synthetic datasets and on CIFAR10 showing that such a procedure amplifies biases of the reward model.
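The curation step described in the abstract can be sketched with samples only. The sketch below assumes a softmax (Luce-style) choice rule in which the user keeps one of K generated candidates with probability proportional to e^{r(x)}; this is one plausible reading of curating "according to a reward model", not necessarily the paper's exact formulation. The Gaussian stand-in model, the quadratic reward, and all function names are hypothetical.

```python
import math
import random

def curate_one(candidates, reward, rng):
    """Keep one of the K candidates with probability proportional to exp(reward)."""
    scores = [reward(x) for x in candidates]
    m = max(scores)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(s - m) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

def curated_dataset(sample_model, reward, k, n, rng):
    """Build n curated samples; each is the pick among k fresh model samples.
    Only the picks are returned: rejected candidates and reward values are dropped,
    so the retrained model never sees them."""
    return [
        curate_one([sample_model(rng) for _ in range(k)], reward, rng)
        for _ in range(n)
    ]

def sample_model(rng):
    """Hypothetical current model p_t: a standard Gaussian in one dimension."""
    return rng.gauss(0.0, 1.0)

def reward(x):
    """Hypothetical reward, peaked at x = 1."""
    return -(x - 1.0) ** 2

rng = random.Random(0)
raw = [sample_model(rng) for _ in range(5000)]
curated = curated_dataset(sample_model, reward, k=4, n=5000, rng=rng)

mean_reward_raw = sum(map(reward, raw)) / len(raw)
mean_reward_curated = sum(map(reward, curated)) / len(curated)
```

Comparing `mean_reward_curated` with `mean_reward_raw` shows the shift toward higher reward that a single curation round induces. Retraining on `curated` would then move the next model in that direction without it ever evaluating `reward` itself.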

Paper Structure (29 sections, 19 theorems, 74 equations, 10 figures, 1 algorithm)

Key Result

Lemma 2.1

Let $p_{t+1}$ be defined as in eq_pb_statment_infinite_lambda. If $\mathcal{P}= \mathcal{P}(\mathbb{R}^d)$ is the set of probability distributions on $\mathbb{R}^d$, and if we assume that $\mathbb{E}_{y \sim p_t}\left[e^{r(y)}\right]<\infty$, then we have for all $x \in \mathbb{R}^d$,

Figures (10)

  • Figure 1: Illustration of the curation phenomenon: 1. The user proposes prompts such as "butterfly going to the bathroom", 2. Four images are generated with Midjourney, 3. The user upscales only one image (e.g. the top-left one), 4. Only upscaled images are incorporated into the JourneyDB dataset (journeydb). Samples from other diffusion models can be found in \ref{fig_midjourney} and \ref{fig_stable_diff}.
  • Figure 2: CIFAR-10. Evolution of the proportion of the class 'Airplane' and of the $9$ other classes when filtering on curated synthetic samples with reward $r(x)=\gamma \cdot q_0(x)$
  • Figure 3: CIFAR-10. Evolution of the proportion of each class and the average reward $r(x)$ when filtering based on the confidence of a classifier. On the left, retraining is done solely on the curated synthetic samples, which results in the emergence of proportion biases. On the right, retraining is performed on a mixture of real and curated synthetic samples, which results in increased stability while the reward still increases.
  • Figure 4: Mixture of Gaussians. Iterative retraining on the mixture of Gaussians dataset for $5$ iterations. On the top row, we display the fully filtered synthetic loop, and below we use a mixture of real and filtered data.
  • Figure 5: Two moons. Iterative retraining on the two moons dataset for $5$ iterations. On the top row, we display the fully filtered synthetic loop, and below we use a mixture of real and filtered data.
  • ...and 5 more figures

Theorems & Definitions (32)

  • Lemma 2.1
  • Lemma 2.2
  • Theorem 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Theorem 2.4
  • Lemma A.1
  • proof
  • Lemma A.1
  • proof
  • ...and 22 more