Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection

Zhixin Lai, Xuesheng Zhang, Suiyao Chen

TL;DR

The paper addresses the detection of text generated by large language models, and in particular the weak generalization of single transformer classifiers to out-of-distribution data. It compares five fine-tuned, frozen-backbone transformer detectors and combines them using both non-adaptive and adaptive ensemble methods. Adaptive ensembles outperform single models and non-adaptive ensembles, reaching up to 99.2% accuracy on in-distribution data and up to 73.6% accuracy on out-of-distribution data. The results suggest that adaptive ensembling is a practical way to make LLM-generated text detectors more reliable across differing data distributions.

Abstract

Large language models (LLMs) have reached human-like proficiency in generating diverse textual content, underscoring the need for effective fake-text detection to mitigate risks such as fake news on social media. Previous research has mostly evaluated single models on in-distribution datasets, which limits our understanding of how these models perform on other types of data in the LLM-generated text detection task. We investigate this by evaluating five specialized transformer-based models on both in-distribution and out-of-distribution datasets to better assess their performance and generalizability. Our results show that single transformer-based classifiers achieve decent performance on the in-distribution dataset but generalize poorly to the out-of-distribution dataset. To address this, we combine the individual classifiers using adaptive ensemble algorithms, which improves the average accuracy from 91.8% to 99.2% on an in-distribution test set and from 62.9% to 72.5% on an out-of-distribution test set. The results indicate the effectiveness, good generalization ability, and potential of adaptive ensemble algorithms for LLM-generated text detection.
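
The abstract does not say which adaptive ensemble algorithm is used, so the following is only a minimal sketch of the general idea, not the authors' method. It contrasts a non-adaptive ensemble (an unweighted mean of the detectors' probabilities) with one assumed adaptive variant: a logistic-regression combiner that learns one weight per detector from labeled data. All function names and hyperparameters (`lr`, `steps`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def average_ensemble(probs):
    """Non-adaptive baseline: unweighted mean of the detectors' probabilities.

    probs is a list of rows, one row per text, one probability per detector.
    """
    return [sum(row) / len(row) for row in probs]


def fit_weighted_ensemble(probs, labels, lr=0.5, steps=2000):
    """Assumed adaptive variant: learn one weight per detector plus a bias.

    This is plain logistic regression on the detectors' outputs, trained
    with batch gradient descent. It is an illustration, not the paper's
    algorithm.
    """
    n, m = len(probs), len(probs[0])
    w, b = [0.0] * m, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * m, 0.0
        for row, y in zip(probs, labels):
            # Prediction error of the current combiner on this text.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - y
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_weighted(probs, w, b):
    """Combined probability that each text is LLM-generated."""
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) for row in probs]


def accuracy(scores, labels, threshold=0.5):
    """Fraction of texts whose thresholded score matches the label."""
    hits = sum(int(s >= threshold) == y for s, y in zip(scores, labels))
    return hits / len(labels)
```

In this sketch, a detector whose outputs track the labels closely receives a large weight, while an uninformative detector is down-weighted. The unweighted mean cannot do this, which is one plausible reason a learned combination might generalize better.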

Paper Structure (15 sections, 5 figures, 4 tables)


Figures (5)

  • Figure 1: Topic distribution of the datasets.
  • Figure 2: Dataset part-of-speech tags.
  • Figure 3: Structure of single-classifier detection.
  • Figure 4: Structure of ensemble detection.
  • Figure 5: Average accuracy of different methods.