A Markovian View of Iterative-Feedback Loops in Image Generative Models: Neural Resonance and Model Collapse

Vibhas Kumar Vats, David J. Crandall, Samuel Goree

TL;DR

Neural resonance provides a unified explanation for long-term degenerate behavior in generative models and provides practical diagnostics for identifying, characterizing, and eventually mitigating collapse.

Abstract

AI training datasets will inevitably contain AI-generated examples, leading to "feedback" in which the output of one model impacts the training of another. It is known that such iterative feedback can lead to model collapse, yet the mechanisms underlying this degeneration remain poorly understood. Here we show that a broad class of feedback processes converges to a low-dimensional invariant structure in latent space, a phenomenon we call neural resonance. By modeling iterative feedback as a Markov chain, we show that two conditions are needed for this resonance to occur: ergodicity of the feedback process and directional contraction of the latent representation. By studying diffusion models on MNIST and ImageNet, as well as CycleGAN and an audio feedback experiment, we map how local and global manifold geometry evolve, and we introduce an eight-pattern taxonomy of collapse behaviors. Neural resonance provides a unified explanation for long-term degenerate behavior in generative models and provides practical diagnostics for identifying, characterizing, and eventually mitigating collapse.

Paper Structure

This paper contains 43 sections, 6 theorems, 24 equations, 11 figures, 3 tables, 1 algorithm.

Key Result

Lemma C.1

The one-step diffusion-based generative Markov kernel has positive density; that is, the one-step density $K(x,y)$ of $K(x, \cdot)$ satisfies $K(x,y) > 0$ for all $y \in \mathbf{E}$. Therefore, for any $A$ with $\psi(A) > 0$,

Figures (11)

  • Figure 1: A high-level representation of the iterative feedback process. (a) A graphical representation of the generational Markov chain and a cartoon overview of our iterative generation settings. $X_n$ represents the current distribution of images at generation $n$. (b) Data distribution distances for two experiments in scenario (3) and three experiments in scenario (4), measured using FID. The top plot shows single-step changes between consecutive generations, while the bottom plot shows cumulative change from the original distribution. Experimental runs with differing numbers of generations are scaled to the width of the graph, so that the x-axis represents the fraction of total generations, and each point is one generation.
  • Figure 2: Evolution of class exemplars under iterative feedback. Columns: same-generation samples. (a–c) MNIST: latent-feedback progressively loses semantic coherence; label-guided and unconditional retraining retain class identity but converge to repetitive templates owing to high compressibility. (d–e) ImageNet-5: both regimes collapse, but differently: latent-feedback preserves coarse class cues while producing entangled objects; label-guided retraining loses most class semantics. (f) CycleGAN (non-ergodic): trajectories settle into distinct attractor basins between the two domains. Random examples from the same class are shown for each scenario.
  • Figure 3: An idealized example of local and cumulative drifts in generational Markov chains. The blue curve ($FID_{n,n-1}$) shows local drift between successive generations, and the orange curve ($FID_{n,0}$) shows cumulative drift from the original data. When both curves have steep slopes, the chain is in an active transient phase; when one curve flattens and the other changes slowly, it is in a slow transient phase; and when both plateau, the system has reached empirical stationarity.
  • Figure 4: A high-level representation of the iterative feedback process. $X_n$ represents the current distribution of images at generation $n$, and $T(\cdot)$ is the transformation operator. The framework unifies five feedback settings: Lucier’s acoustic experiment, cyclic image translation (CycleGAN), latent-feedback diffusion, and two retrained diffusion models (label-guided and unconditional). Each represents a distinct form of iterative feedback. Together they illustrate how successive generations of models form a Markov process with shared dynamics.
  • Figure 5: Local and cumulative drifts in generational Markov chains on MNIST. Each plot shows the local drift ($FID_{n,n-1}$, top) and cumulative drift ($FID_{n,0}$, bottom) across generations. Results are shown for the label-guided retrained diffusion model (top left), the unconditional retrained diffusion model (top right), and the latent-feedback diffusion model (bottom). Together they illustrate distinct convergence behaviors: rapid stabilization in the label-guided case, continued evolution in the unconditional model, and gradual stationarity in the latent-feedback chain.
  • ...and 6 more figures

Theorems & Definitions (12)

  • Definition 1: $\psi$-irreducible
  • Lemma C.1: One-step positive density
  • proof
  • Lemma C.2: Local self-transition
  • proof
  • Lemma C.3: Small set in finite space
  • Lemma C.4: Petite set minorization
  • Lemma C.5: $\psi$-irreducible + aperiodicity + petite set $\Rightarrow$ Harris recurrence
  • Definition 2: Total variation (TV) norm ($\lVert \cdot \rVert_{TV}$)
  • Corollary 1: Ergodic GMCs
  • ...and 2 more