Convergence and Stability Analysis of Self-Consuming Generative Models with Heterogeneous Human Curation
Hongru Zhao, Jinwen Fu, Tuan Pham
TL;DR
The paper analyzes the convergence and stability of self-consuming generative models under heterogeneous human preferences. It models the retraining loop as population-level updates with a finite or infinite candidate pool and an anchoring regularization parameter $\alpha$ that blends curated data with a reference distribution. It derives closed-form population updates across four regimes, proving KL convergence in the purely synthetic case and contraction results (in total variation or Hilbert projective metrics) when regularization is present, with an explicit fixed-point structure in the infinite-pool regime. The main contributions are (i) a heterogeneous random-utility curation model, (ii) exact dynamics in four regimes, (iii) convergence guarantees and stability analyses under reward perturbations, and (iv) design guidance emphasizing the stabilizing role of reference data in curate-and-retrain loops. These results have practical implications for alignment pipelines (e.g., DPO/RLHF-style methods) by clarifying when anchor regularization ensures convergence and robustness to misspecified feedback.
Abstract
Self-consuming generative models have received significant attention over the last few years. In this paper, we study a self-consuming generative model with heterogeneous preferences that generalizes the model of Ferbach et al. (2024). The model is retrained round by round using real data and its own synthetic outputs from the previous round. We investigate the asymptotic behavior of the retraining dynamics across four regimes using different techniques, including nonlinear Perron--Frobenius theory. Our analysis improves upon that of Ferbach et al. (2024) and provides convergence results in settings where the well-known Banach contraction mapping argument does not apply. Stability and instability results for the retraining dynamics are also given.
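To make the curate-and-retrain loop concrete, the following is a minimal sketch on a finite sample space. It is not the paper's closed-form update. It assumes an infinite-pool form in which each user type exponentially tilts the current model by its own reward, the tilted distributions are averaged over user types, and the result is mixed with a reference distribution. The convention that $\alpha$ is the weight on the reference distribution, and the names `retrain_step`, `run`, `rewards`, `user_weights`, and `p_ref`, are all hypothetical and chosen for illustration.

```python
import numpy as np

def retrain_step(p, rewards, user_weights, p_ref, alpha):
    """One hypothetical population-level update (assumed form, not from the paper).

    p            : current model distribution, shape (n,)
    rewards      : rewards[u, x] = reward of user type u for item x, shape (m, n)
    user_weights : probability of each user type, shape (m,)
    p_ref        : reference (anchor) distribution, shape (n,)
    alpha        : weight on the reference distribution (assumed convention)
    """
    # Exponential tilt of p by each user's reward; subtracting the row
    # maximum only improves numerical stability and cancels on normalization.
    tilt = p[None, :] * np.exp(rewards - rewards.max(axis=1, keepdims=True))
    tilt /= tilt.sum(axis=1, keepdims=True)
    # Average the per-user curated distributions over the user population.
    curated = user_weights @ tilt
    # Anchor: blend the curated distribution with the reference distribution.
    return alpha * p_ref + (1.0 - alpha) * curated

def run(p0, rewards, user_weights, p_ref, alpha, steps):
    """Iterate the update for a fixed number of retraining rounds."""
    p = p0.copy()
    for _ in range(steps):
        p = retrain_step(p, rewards, user_weights, p_ref, alpha)
    return p
```

Under these assumptions, setting `alpha = 0` with a single user type drives the mass toward the highest-reward item, while any `alpha > 0` keeps every item's probability at or above `alpha * p_ref`, which illustrates the anchoring role of reference data described in the TL;DR.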
