FAID: Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning

Minh Ngoc Ta, Dong Cao Van, Duc-Anh Hoang, Minh Le-Anh, Truong Nguyen, My Anh Tran Nguyen, Yuxia Wang, Preslav Nakov, Sang Dinh

TL;DR

FAID addresses the need for fine-grained detection of human-written, LLM-generated, and human–LLM collaborative texts in multilingual, multi-domain settings. It introduces FAIDSet, a large multilingual dataset spanning English and Vietnamese across academic domains, and FAID, a framework that combines multi-level contrastive learning with multi-task auxiliary learning to learn author-like distinctions for LLM families. FAID leverages an encoder (XLM-RoBERTa) trained with a multi-level contrastive loss and an auxiliary human-vs-LLM classifier, while handling unseen data at inference through a vector database and fuzzy kNN without retraining. Empirical results show FAID outperforms strong baselines in both in-domain and challenging unseen-domain/generator scenarios, demonstrates robust real-world generalization, and enables generator attribution, offering practical benefits for transparency and accountability in AI-assisted writing.

Abstract

The growing collaboration between humans and AI models in generative tasks has introduced new challenges in distinguishing between human-written, LLM-generated, and human–LLM collaborative texts. In this work, we collect a multilingual, multi-domain, multi-generator dataset FAIDSet. We further introduce a fine-grained detection framework FAID to classify text into these three categories, and also to identify the underlying LLM family of the generator. Unlike existing binary classifiers, FAID is built to capture both authorship and model-specific characteristics. Our method combines multi-level contrastive learning with multi-task auxiliary classification to learn subtle stylistic cues. By modeling LLM families as distinct stylistic entities, we incorporate an adaptation to address distributional shifts without retraining for unseen data. Our experimental results demonstrate that FAID outperforms several baselines, particularly enhancing the generalization accuracy on unseen domains and new LLMs, thus offering a potential solution for improving transparency and accountability in AI-assisted writing.
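
To make the contrastive objective mentioned above more concrete, the following is a minimal sketch of a generic, single-level supervised contrastive loss over labelled embeddings. It is not the paper's multi-level loss, whose exact form is not given in this summary. The function names, the cosine similarity measure, the temperature value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, single level only).

    For each anchor, samples sharing its label are positives and all other
    samples are negatives. The loss is low when positives are more similar
    to the anchor than negatives are.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarities to every other sample.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy 2-D "embeddings" (hypothetical values, not from FAIDSet).
toy_embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]

# Labels that agree with the geometry versus labels that cut across it.
loss_clustered = supervised_contrastive_loss(toy_embeddings, ["A", "A", "B", "B"])
loss_mixed = supervised_contrastive_loss(toy_embeddings, ["A", "B", "A", "B"])
```

In this toy setting `loss_clustered` is smaller than `loss_mixed`, which is the pressure that pulls same-author embeddings together and pushes different authors apart. A multi-level variant would presumably combine several such terms at different label granularities (for example, human vs. LLM and LLM family), but that combination is an assumption here.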

Paper Structure

This paper contains 50 sections, 10 equations, 5 figures, 11 tables.

Figures (5)

  • Figure 1: Training architecture. Using a multi-level contrastive learning loss, we fine-tune a language model (we select XLM-RoBERTa; see the model-selection appendix) on human, human–LLM, and LLM-generated texts. This forces the model to reorganize the hidden space, pulling embeddings within the same author family closer together and pushing embeddings from different authors farther apart. The resulting encoder represents text with signals distinctive enough to discern its authorship.
  • Figure 2: Inference architecture: (a) embed the input text into a vector using the fine-tuned encoder; (b) use fuzzy kNN to determine which cluster the input text belongs to (see the ablation-study appendix for more); (c) the stored vector database $\mathcal{VD}$ is created by saving the embeddings of all texts in the training and validation sets, computed with the fine-tuned encoder. If the input text is unseen, we embed it and save it into a temporary vector database $\mathcal{VD}'$, enhancing the generalization of the detector.
  • Figure 3: Text length distributions in words and characters across Llama-3.3, GPT-4o/4o-mini, Gemini 2.0, Gemini 2.0 Flash-Lite, and Gemini 1.5 Flash.
  • Figure 4: Visualizations showing clustering behavior of Gemini model family (Gemini 2.0, Gemini 2.0 Flash-Lite, Gemini 1.5 Flash) and GPT-4o/4o-mini using 2D and 3D embeddings with sample size of 2000 texts.
  • Figure 5: Top 20 most common trigrams from Gemini 2.0, Gemini 2.0 Flash-Lite, Gemini 1.5 Flash using 500 sample prompts.
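
The inference steps described for Figure 2 can be illustrated with a small fuzzy kNN sketch over a toy vector database. This is a sketch under stated assumptions rather than the authors' implementation: it follows the classic inverse-distance fuzzy kNN weighting, and the Euclidean distance, the fuzzifier `m`, the value of `k`, the function names, and the 2-D vectors are all hypothetical choices. In the real system the vectors would come from the fine-tuned encoder.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_knn(query, vectors, labels, k=3, m=2.0, eps=1e-9):
    """Return soft class memberships for `query`; the values sum to 1.

    Each of the k nearest stored vectors votes for its own label with a
    weight of 1 / distance^(2 / (m - 1)), so closer neighbours count more.
    """
    neighbours = sorted(
        ((euclidean(query, vec), lab) for vec, lab in zip(vectors, labels)),
        key=lambda pair: pair[0],
    )[:k]
    weights = defaultdict(float)
    for dist, lab in neighbours:
        # eps avoids division by zero when the query matches a stored vector.
        weights[lab] += 1.0 / (dist ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    return {lab: w / total for lab, w in weights.items()}


# Toy stand-in for the stored vector database (hypothetical values).
vector_db = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.1), (0.5, 0.5)]
db_labels = ["human", "human", "llm", "llm", "human-llm"]

query_embedding = (0.05, 0.05)
memberships = fuzzy_knn(query_embedding, vector_db, db_labels, k=3)
predicted = max(memberships, key=memberships.get)

# Mimic the temporary database for unseen inputs: store the new embedding
# with its predicted label so later queries can use it, without retraining.
temp_db = list(vector_db) + [query_embedding]
temp_labels = list(db_labels) + [predicted]
```

Because the output is a set of soft memberships rather than a single hard vote, a text that lies between clusters (for example, a human–LLM collaborative text) can be reported with graded confidence. How the actual system decides that an input is unseen, and how it uses the temporary database afterwards, is not specified in this summary, so the last two lines are only one plausible reading.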