Preventing Model Collapse via Contraction-Conditioned Neural Filters
Zongjian Han, Yiran Liang, Ruiwen Wang, Yiwei Luo, Yilin Huang, Xiaotong Song, Dongqing Wei
TL;DR
The paper tackles model-collapse risk in recursive training with self-generated synthetic data by reframing estimation error as a nonlinear stochastic process and enforcing a contraction condition via a learned neural filter. It introduces a contraction-operator framework, incorporating a Lyapunov-based contraction trigger and a data-filter network that selects informative samples to ensure convergence even with constant sample sizes. The authors prove probabilistic convergence guarantees (Theorems 1 and 3) and demonstrate empirically that the contraction-conditioned filter stabilizes parameter estimates and prevents collapse, outperforming the prior superlinear-sample-growth approach. Practical implications are highlighted for large language models, data augmentation, privacy-preserving synthesis, and continual learning, where stable long-term training is essential.
Abstract
This paper presents a neural network filter method based on contraction operators to address model collapse in recursive training of generative models. Unlike \cite{xu2024probabilistic}, which requires superlinear sample growth ($O(t^{1+s})$), our approach completely eliminates the dependence on increasing sample sizes within an unbiased estimation framework by designing a neural filter that learns to satisfy contraction conditions. We develop specialized neural network architectures and loss functions that enable the filter to actively learn contraction conditions satisfying Assumption 2.3 in exponential family distributions, thereby ensuring practical application of our theoretical results. Theoretical analysis demonstrates that when the learned contraction conditions are satisfied, estimation errors converge probabilistically even with constant sample sizes, i.e., $\limsup_{t\to\infty}\mathbb{P}(\|\mathbf{e}_t\|>\delta)=0$ for any $\delta>0$. Experimental results show that our neural network filter effectively learns contraction conditions and prevents model collapse under fixed sample size settings, providing an end-to-end solution for practical applications.
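
To make the failure mode concrete, the following is a minimal sketch of recursive training without any filter, under illustrative assumptions that are not taken from the paper: a one-dimensional Gaussian model, a constant sample size of 10, and the hypothetical function name `recursive_gaussian_fit`. It does not implement the contraction-conditioned filter. It only shows the baseline behavior the abstract refers to: although each refit uses unbiased estimators, the logarithm of the variance estimate has a negative expected drift, so the fitted variance tends toward zero over many generations.

```python
import math
import random
import statistics


def recursive_gaussian_fit(n_samples=10, n_generations=1000, seed=0):
    """Refit a 1-D Gaussian on its own samples; return the variance per generation.

    Illustrative baseline only: no filter is applied (assumption, not the paper's method).
    """
    rng = random.Random(seed)
    mu, var = 0.0, 1.0  # generation 0: the "real" distribution N(0, 1)
    history = [var]
    for _ in range(n_generations):
        # Draw a constant-size synthetic dataset from the current model.
        data = [rng.gauss(mu, math.sqrt(var)) for _ in range(n_samples)]
        # Refit with unbiased estimators: sample mean and (n - 1)-normalized variance.
        mu = statistics.fmean(data)
        var = statistics.variance(data)
        history.append(var)
    return history


if __name__ == "__main__":
    hist = recursive_gaussian_fit()
    for t in (0, 10, 100, 1000):
        print(f"generation {t:4d}: variance = {hist[t]:.3e}")
```

Running the script prints the fitted variance at a few generations; the exact values depend on the seed, but the downward trend is the point of the example.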
