ForTIFAI: Fending Off Recursive Training Induced Failure for AI Model Collapse

Soheil Zibakhsh Shabgahi, Pedram Aghazadeh, Azalia Mirhoseini, Farinaz Koushanfar

TL;DR

This work tackles the problem of model collapse in an era of rising synthetic data by introducing Truncated Cross Entropy (TCE), a confidence-aware loss that masks high-confidence predictions to preserve tail diversity during recursive training. The authors demonstrate that TCE substantially delays collapse, enabling models to tolerate over 2.3x more synthetic data while maintaining knowledge retention and distributional fidelity across multiple architectures and datasets. A comprehensive evaluation framework simulates continual synthetic data accumulation and measures time-to-failure, KR-test accuracy, and KL divergence to validate robustness. The findings suggest that simple, loss-based interventions can significantly extend the practical lifetime of generative models in synthetic-data-dominated regimes, with applicability beyond language modeling to GMMs and VAEs.

Abstract

Growing reliance on generative AI models is rapidly increasing the volume of synthetic data, with some projections suggesting that most new data available for training could be machine-generated by 2030. This shift toward mainly synthetic content presents a critical challenge: repeated training on synthetic data leads to a phenomenon known as model collapse, where model performance degrades over generations of training, eventually rendering the models ineffective. While the causes of model collapse are increasingly understood, effective mitigation strategies remain scarce. We address this challenge by leveraging a key insight: auto-regressive models tend to generate text sequences to which they assign high confidence (i.e., high log-likelihood). Based on this observation, we introduce the Truncated-Cross-Entropy (TCE) loss function. TCE mitigates collapse by selectively ignoring high-confidence tokens during training, effectively filtering out likely machine-generated artifacts from the learning process. Our experiments demonstrate that models trained with TCE not only learn effectively but also exhibit significantly increased resilience, tolerating over 2.3x more synthetic data before the onset of collapse. In addition, we provide an open-source benchmark for collapse dynamics in mixed-data settings. Our results demonstrate that confidence-aware training objectives can substantially delay collapse onset, offering a practical and generalizable tool for model robustness under synthetic-data exposure.
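
The idea of ignoring high-confidence tokens can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the hard threshold `gamma`, and the choice to average over the kept tokens only are assumptions made for illustration, since the abstract does not specify the exact formulation.

```python
import math


def cross_entropy(target_probs):
    """Standard mean negative log-likelihood over all tokens.

    `target_probs` holds the probability the model assigns to each
    ground-truth token (assumed to lie in (0, 1]).
    """
    return -sum(math.log(p) for p in target_probs) / len(target_probs)


def truncated_cross_entropy(target_probs, gamma=0.9):
    """Hypothetical TCE sketch: drop tokens predicted with probability > gamma.

    Tokens the model is already very confident about are masked out, so they
    contribute nothing to the loss. Averaging over the remaining tokens is an
    assumption of this sketch, not a detail taken from the paper.
    """
    kept = [p for p in target_probs if p <= gamma]
    if not kept:
        # Every token was masked: nothing left to learn from in this sequence.
        return 0.0
    return -sum(math.log(p) for p in kept) / len(kept)


# Probabilities assigned to the target token at four positions.
probs = [0.99, 0.95, 0.40, 0.10]
ce = cross_entropy(probs)
tce = truncated_cross_entropy(probs, gamma=0.9)
```

In this example the two high-confidence positions (0.99 and 0.95) are masked, so the truncated loss is driven entirely by the two uncertain tokens. In a real training loop the same mask would be applied per token to the logits-based loss before reduction.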

Paper Structure

This paper contains 32 sections, 10 equations, 13 figures, 10 tables, 2 algorithms.

Figures (13)

  • Figure 1: A model exhibits significantly higher confidence in its own generated data compared to unseen data. The histograms show token probabilities from a base LLaMA-3.2-1B model (blue) and a fine-tuned version (orange). When evaluating its own synthetic generations (right column), the fine-tuned model shows a clear shift toward higher confidence compared to its evaluation on unseen data (left column). In contrast, the base model's confidence distribution remains consistent across both datasets, identifying elevated confidence as a sign of self-generated text.
  • Figure 2: Effect of TCE on a one-dimensional Gaussian estimator under a fully synthetic training loop. In each generation, $10{,}000$ samples are generated with sampling bias $\gamma=0.9$. Standard estimation with Cross Entropy (CE) leads to rapid variance collapse. By contrast, TCE with a well-chosen threshold ($\gamma=0.85$) delays collapse substantially. Conversely, if the threshold is set too low ($\gamma=0.80$), the variance may diverge.
  • Figure 3: Our experimental setup simulates model collapse, illustrating the transition from predominantly human-generated content to mostly synthetic datasets. $M_i$ denotes the $i$-th generation of recursively trained models. The clean data is initially divided into $N$ equally sized splits. At each generation, the entire dataset from the previous iteration is regenerated and used as training data for the next iteration. One split of clean data is added to this training dataset to simulate the accumulation of non-generated data in the training set. The red shading of each data split indicates contamination from recursive training on self-generated data; darker red denotes lower quality.
  • Figure 4: Total knowledge retention (KR-total) accuracy of the LLaMA-3.2-1B model trained on Wikitext. TCE retains learning capability and even exceeds the baseline across the different stages.
  • Figure 5: Our proposed solution consistently outperforms the baseline (CE) across all benchmarks when the experiment is run on both datasets. Figures (a) and (b) show the results of LLaMA-3.2-1B across different benchmarks.
  • ...and 8 more figures
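
The data accumulation scheme described in the Figure 3 caption can be sketched as a simple schedule. This is an illustrative reading of the caption, not the authors' benchmark code: the function name is hypothetical, and the sketch assumes that the first generation trains on one clean split only and that every generation thereafter regenerates the whole previous training set before one new clean split is added.

```python
def synthetic_fraction_schedule(num_splits):
    """Return (generation, synthetic_fraction) pairs for N clean splits.

    Sizes are counted in units of one clean split. Assumption of this sketch:
    generation 1 sees only clean data; afterwards, the full previous training
    set is regenerated by the previous model and so counts as synthetic.
    """
    schedule = []
    synthetic = 0  # split-sized chunks that were regenerated by a model
    for generation in range(1, num_splits + 1):
        clean = 1  # exactly one fresh clean split is added per generation
        total = synthetic + clean
        schedule.append((generation, synthetic / total))
        # The model trained on this set regenerates all of it for the next round.
        synthetic = total
    return schedule


schedule = synthetic_fraction_schedule(4)
```

Under these assumptions the synthetic share at generation $i$ is $(i-1)/i$, so the training set moves from fully clean toward mostly synthetic as generations accumulate, which matches the transition the caption describes.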