Automated Text Identification Using CNN and Training Dynamics
Claudiu Creanga, Liviu Petrisor Dinu
TL;DR
The paper addresses the challenge of distinguishing human-written from AI-generated text and of generalizing to unseen writing styles. It uses Data Maps to analyze the training dynamics of a CNN-based classifier on the AuTexTification dataset, identifies three learnability regions, and shows that training on ambiguous examples can boost out-of-distribution performance. Empirically, training only on ambiguous samples yields higher F1 scores (around 0.66) than training on the full dataset, which suggests that data selection strategies emphasizing ambiguity can improve robustness to novel domains. The work informs dataset design and training strategies for AI-generated text detection, with practical implications for mitigating misinformation and ensuring ethical AI use.
Abstract
We used Data Maps to model and characterize the AuTexTification dataset. Data Maps provide insight into the behaviour of individual samples during training across epochs (training dynamics). We characterized the samples along three dimensions: confidence, variability and correctness. This reveals three regions: easy-to-learn, ambiguous and hard-to-learn examples. Using a classic CNN architecture, we found that training the model only on a subset of ambiguous examples improves its out-of-distribution generalization.
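To make the three dimensions concrete, here is a minimal sketch of how they are commonly computed in Data Maps: confidence as the mean probability assigned to the true label across epochs, variability as the standard deviation of that probability, and correctness as the fraction of epochs in which the sample was classified correctly. The function names, the input layout (one row per epoch, one column per sample), and the `fraction` cut-off are assumptions made for illustration; the abstract does not state how large the ambiguous subset was or how it was selected.

```python
import numpy as np


def training_dynamics(gold_probs, correct):
    """Per-sample Data Map statistics.

    gold_probs: shape (epochs, samples), probability the model assigned
                to the true label at each epoch.
    correct:    shape (epochs, samples), True where the prediction
                matched the true label at that epoch.
    """
    gold_probs = np.asarray(gold_probs, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    confidence = gold_probs.mean(axis=0)   # mean true-label probability
    variability = gold_probs.std(axis=0)   # spread across epochs
    correctness = correct.mean(axis=0)     # fraction of epochs correct
    return confidence, variability, correctness


def select_ambiguous(variability, fraction=0.33):
    """Indices of the most variable samples (hypothetical cut-off)."""
    k = max(1, int(round(fraction * len(variability))))
    return np.argsort(variability)[::-1][:k]


# Toy example: 3 epochs, 3 samples (easy, ambiguous, hard).
probs = np.array([
    [0.90, 0.50, 0.10],
    [0.95, 0.20, 0.15],
    [0.99, 0.80, 0.05],
])
hits = probs > 0.5  # binary task: correct when true-label prob exceeds 0.5
conf, var, corr = training_dynamics(probs, hits)
ambiguous_idx = select_ambiguous(var, fraction=0.34)
```

In this toy data the first sample has high confidence and low variability (easy-to-learn), the third has low confidence and low variability (hard-to-learn), and the second fluctuates between epochs, so it is the one picked as ambiguous.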
