Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models
Avinash Trivedi, Sangeetha Sivanesan
TL;DR
The paper tackles robust AI-generated text detection for the Defactify 4.0 shared task by combining noise-driven data augmentation with an ensemble of DeBERTa models. Baseline experiments show that DeBERTa-v3-small performs best at distinguishing AI-generated from human-written text, but the Task-B models overfit, which motivates a noise-augmented training regime. The authors show that injecting controlled noise (10% junk words) and ensembling models trained on the original and noised data yields the top-ranked results: an F1 of 1.0 on Task-A and 0.9531 on Task-B. The work highlights noise-based regularization as an effective strategy for robust AI-generated text detection and offers a starting point for future detectors as LLMs become more sophisticated.
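To make the "10% junk words" idea concrete, here is a minimal sketch of one way such noise could be injected. It is an illustration under assumptions, not the authors' implementation: the helper names `make_junk_word` and `add_junk_words`, the choice of random lowercase strings as junk, and the random insertion positions are all hypothetical, since this summary does not specify how the junk words are generated or placed.

```python
import random
import string


def make_junk_word(rng, min_len=3, max_len=8):
    """Return a random lowercase string (assumed form of a 'junk word')."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def add_junk_words(text, noise_ratio=0.10, seed=None):
    """Insert junk words at random positions in `text`.

    The number of inserted words is noise_ratio times the original
    word count, rounded to the nearest integer. Original words keep
    their relative order. Whitespace is normalized to single spaces.
    """
    rng = random.Random(seed)
    words = text.split()
    n_junk = round(len(words) * noise_ratio)
    for _ in range(n_junk):
        # Any position from the start (0) to the end (len(words)) is allowed.
        position = rng.randint(0, len(words))
        words.insert(position, make_junk_word(rng))
    return " ".join(words)


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog near the quiet river bank today"
    print(add_junk_words(sample, noise_ratio=0.10, seed=0))
```

A noised copy of each training example produced this way could be used to train a second model alongside one trained on the original data, which is the setup the summary describes at a high level.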
Abstract
This paper presents an effective approach to detecting AI-generated text, developed for the Defactify 4.0 shared task at the fourth Workshop on Multimodal Fact Checking and Hate Speech Detection. The task consists of two subtasks: Task-A, classifying whether a text is AI-generated or human-written, and Task-B, identifying the specific large language model that generated the text. Our team (Sarang) achieved 1st place in both tasks, with F1 scores of 1.0 and 0.9531, respectively. The methodology involves adding noise to the dataset to improve model robustness and generalization. We used an ensemble of DeBERTa models to capture complex patterns in the text. The results indicate the effectiveness of our noise-driven, ensemble-based approach, setting a new standard in AI-generated text detection and providing guidance for future developments.
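As an illustration of how predictions from several classifiers might be combined, the sketch below averages per-model class probabilities (soft voting). This is an assumption made for the example: the abstract does not state which combination rule the ensemble uses, and the function names `softmax` and `ensemble_predict` are hypothetical. The inputs are plain lists of logits, standing in for the outputs of fine-tuned DeBERTa classifiers.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Soft-voting ensemble for a single input text.

    per_model_logits: one list of class logits per model, all the same length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    per_model_probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    averaged = [
        sum(probs[c] for probs in per_model_probs) / n_models
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


if __name__ == "__main__":
    # Hypothetical logits from two models, e.g. one trained on original
    # data and one on noised data, for a two-class (human vs. AI) input.
    label, probs = ensemble_predict([[2.0, 0.0], [0.0, 1.0]])
    print(label, probs)
```

Averaging probabilities rather than taking a majority vote lets a confident model outweigh an uncertain one, which is one common reason to prefer soft voting when only a few models are ensembled.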
