VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text
Trieu Hai Nguyen, Sivaswamy Akilesh
TL;DR
VietBinoculars addresses the challenge of detecting Vietnamese LLM-generated text using a zero-shot extension of the Binoculars framework. It leverages two closely related Vietnamese LLMs, PhoGPT-4B and PhoGPT-4B-Chat, with a shared BPE tokenizer, to compute a robust detector score that normalizes perplexity via cross-perplexity. The approach achieves exceptional out-of-domain performance (often >99% accuracy, F1, and AUC) and outperforms commercial tools and several zero-shot baselines, including under challenging prompting regimes. Its practical impact lies in providing a language-specific, training-free detector that can be deployed with carefully chosen global thresholds, while acknowledging limitations such as domain coverage and computational overhead. The work also demonstrates the Capybara problem scenario where VietBinoculars substantially surpasses other detectors, underscoring the value of language-aware, zero-shot detection for Vietnamese content.
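To make the scoring idea concrete, the following is a minimal sketch of a Binoculars-style score: the log-perplexity of the text under one model, divided by the cross-perplexity between two models that share a vocabulary. The function names, the use of plain Python lists of logits, and the choice of which model supplies the perplexity term are illustrative assumptions, not the authors' implementation. In practice the logits would come from PhoGPT-4B and PhoGPT-4B-Chat run on the same tokenized input.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def binoculars_style_score(logits_a, logits_b, token_ids):
    """Ratio of log-perplexity to cross-perplexity (hypothetical helper).

    logits_a, logits_b: per-position next-token logits from two models that
        share a tokenizer (assumed layout: one list of vocab logits per position).
    token_ids: the token actually observed at each position.
    """
    log_ppl = 0.0   # summed negative log-likelihood of observed tokens under model A
    cross = 0.0     # summed cross-entropy between model A's and model B's predictions
    for la, lb, tok in zip(logits_a, logits_b, token_ids):
        lp_a = log_softmax(la)
        lp_b = log_softmax(lb)
        log_ppl -= lp_a[tok]
        cross -= sum(math.exp(pa) * pb for pa, pb in zip(lp_a, lp_b))
    n = len(token_ids)
    return (log_ppl / n) / (cross / n)


# Toy example with a 3-token vocabulary (made-up numbers, for illustration only).
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
confident = [[5.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
score_uniform = binoculars_style_score(uniform, uniform, [0, 1])
score_confident = binoculars_style_score(confident, uniform, [0, 0])
```

In this formulation, text that the first model finds unusually predictable relative to the cross-perplexity baseline receives a lower score, and lower scores are read as evidence of LLM generation.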
Abstract
The rapid development of Large Language Models (LLMs) based on transformer architectures raises key challenges, one of which is distinguishing human-written text from LLM-generated text. As LLM-generated text becomes increasingly sophisticated and more closely resembles human writing, traditional detection methods are proving less effective, especially as the number and diversity of LLMs continue to grow, with new models and versions released at a rapid pace. This study proposes VietBinoculars, an adaptation of the Binoculars method with optimized global thresholds, to improve the detection of Vietnamese LLM-generated text. We constructed new Vietnamese AI-generated datasets to determine the optimal thresholds for VietBinoculars and to enable benchmarking. Our experiments show that VietBinoculars achieves over 99% accuracy, F1-score, and AUC in both domains on multiple out-of-domain datasets. It outperforms the original Binoculars model, traditional detection methods, and other state-of-the-art approaches, including commercial tools such as ZeroGPT and DetectGPT, especially under specially modified prompting strategies.
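As an illustration of what choosing a global threshold can look like, the sketch below scans candidate cut-offs on a labeled calibration set and keeps the one with the highest F1-score. The selection criterion (F1), the helper names, and the toy scores are assumptions made for this example; the abstract does not specify the exact optimization procedure.

```python
def f1_at(scores, labels, threshold):
    """F1 for the rule 'predict LLM-generated (label 1) when score < threshold'."""
    tp = fp = fn = 0
    for s, y in zip(scores, labels):
        pred = 1 if s < threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels):
    """Pick the midpoint between consecutive distinct scores that maximizes F1."""
    distinct = sorted(set(scores))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    candidates = midpoints or distinct  # fall back if all scores are identical
    return max(candidates, key=lambda t: f1_at(scores, labels, t))


# Made-up calibration data: lower scores belong to LLM-generated samples (label 1).
cal_scores = [0.70, 0.75, 0.80, 0.95, 1.00, 1.05]
cal_labels = [1, 1, 1, 0, 0, 0]
threshold = best_threshold(cal_scores, cal_labels)
best_f1 = f1_at(cal_scores, cal_labels, threshold)
```

A threshold fitted this way is global in the sense that the same cut-off is then applied unchanged to every out-of-domain dataset.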
