VietBinoculars: A Zero-Shot Approach for Detecting Vietnamese LLM-Generated Text

Trieu Hai Nguyen, Sivaswamy Akilesh

TL;DR

VietBinoculars detects Vietnamese LLM-generated text with a zero-shot extension of the Binoculars framework. It pairs two closely related Vietnamese LLMs, PhoGPT-4B and PhoGPT-4B-Chat, which share a BPE tokenizer, and computes a detector score that normalizes perplexity by cross-perplexity. The approach reports strong out-of-domain performance (often >99% accuracy, F1, and AUC) and outperforms commercial tools and several zero-shot baselines, including under challenging prompting regimes. In practice it offers a language-specific, training-free detector that can be deployed with carefully chosen global thresholds, subject to limitations such as domain coverage and computational overhead. The work also examines the Capybara problem scenario, in which VietBinoculars substantially surpasses other detectors, underscoring the value of language-aware, zero-shot detection for Vietnamese content.
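
The score described above, perplexity normalized by cross-perplexity, can be illustrated with a minimal sketch. It follows the general Binoculars formulation $\log \text{PPL}_{\mathcal{M}_1}(s) / \log \text{X-PPL}_{\mathcal{M}_1,\mathcal{M}_2}(s)$ and is not the authors' implementation. The function names are hypothetical, the logits are toy NumPy arrays rather than real PhoGPT outputs, and each logit row is assumed to be already aligned with the token it predicts.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the vocabulary axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def log_ppl(logits_m1, token_ids):
    """Mean negative log-likelihood of the observed tokens under model M1.

    Assumption: row i of logits_m1 is the prediction for token_ids[i].
    """
    logp = log_softmax(logits_m1)
    picked = logp[np.arange(len(token_ids)), token_ids]
    return -picked.mean()


def log_xppl(logits_m1, logits_m2):
    """Mean cross-entropy between the next-token distributions of M1 and M2."""
    p1 = np.exp(log_softmax(logits_m1))
    logp2 = log_softmax(logits_m2)
    return -(p1 * logp2).sum(axis=-1).mean()


def binoculars_score(logits_m1, logits_m2, token_ids):
    """Ratio of log-perplexity to log-cross-perplexity (hypothetical helper)."""
    return log_ppl(logits_m1, token_ids) / log_xppl(logits_m1, logits_m2)


# Toy example: 4 token positions, vocabulary of 5 (random stand-in logits).
rng = np.random.default_rng(0)
toy_m1 = rng.normal(size=(4, 5))
toy_m2 = toy_m1 + 0.1 * rng.normal(size=(4, 5))  # a "closely related" model
toy_tokens = [1, 3, 0, 2]
toy_score = binoculars_score(toy_m1, toy_m2, toy_tokens)
```

Both models must share a tokenizer for the two logit arrays to be comparable position by position, which is why the shared BPE tokenizer matters. In the original Binoculars method, lower scores point toward machine-generated text.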

Abstract

The rapid development of Large Language Models (LLMs) based on transformer architectures raises key challenges, one of which is distinguishing human-written text from LLM-generated text. As LLM-generated text becomes increasingly complex over time and more closely resembles human writing, traditional detection methods are proving less effective, especially as the number and diversity of LLMs continue to grow, with new models and versions released at a rapid pace. This study proposes VietBinoculars, an adaptation of the Binoculars method with optimized global thresholds, to enhance the detection of Vietnamese LLM-generated text. We have constructed new Vietnamese AI-generated datasets to determine the optimal thresholds for VietBinoculars and to enable benchmarking. Our experiments show that VietBinoculars achieves over 99% in accuracy, F1-score, and AUC on multiple out-of-domain datasets. It outperforms the original Binoculars model, traditional detection methods, and other state-of-the-art approaches, including commercial tools such as ZeroGPT and DetectGPT, especially under specially modified prompting strategies.

Paper Structure

This paper contains 16 sections, 15 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Illustration of $\log(\text{perplexity})$ and $\log(\text{cross-perplexity})$. The x-axis represents the average next-token prediction probabilities for the same input string $s$. The solid magenta curve represents $\log \text{PPL}_{\mathcal{M}_1}(s)$ and the dashed curves represent $\log \text{X-PPL}_{\mathcal{M}_1,\mathcal{M}_2}(s)$ for different probability distribution vectors between models $\mathcal{M}_1$ and $\mathcal{M}_2$. The dashed blue curve illustrates $\log \text{X-PPL}_{\mathcal{M}_1,\mathcal{M}_2}(s)$ when the observer and performer models are nearly identical. We assume that the observer model $\mathcal{M}_1$ is significantly smaller than the performer model $\mathcal{M}_2$ when the two models differ, as depicted by the dash-dotted red curve.
  • Figure 2: Global optimal threshold points represented on the ROC curve, based on VietBinoculars scores for the Sailor2-8B-OptiThreshold-News dataset: (1) Youden's J point, (2) Closest point, and (3) Optimal point at 0.06% FPR. In panel (a), the Youden's J and Closest thresholds are marked by a red circle [$\bullet$] and a blue square [$\blacksquare$], respectively. The magenta diamond [$\blacklozenge$] denotes the TPR@0.06%FPR threshold in panel (b).
  • Figure 3: Confusion matrices of VietBinoculars on the Sailor2-8B-Validation-News dataset: (a) Youden's J threshold, (b) Closest Point threshold, and (c) TPR@0.06%FPR threshold.
  • Figure 4: The effect of text length (in tokens) on the detection performance of VietBinoculars and the original Binoculars method on Vietnamese out-of-domain datasets: (a) Gemma-3-12B-News; (b) Gemma-3-12B-VuTrongPhung; (c) Sailor2-8B-VuTrongPhung. The x-axis shows the number of tokens, calculated using the BPE tokenizer. Solid and dashed curves represent Accuracy and F1-score, respectively. Red curves with markers $\blacktriangleright$ and $\times$ correspond to the original Binoculars method, while blue curves with markers $\blacklozenge$ and $\blacksquare$ correspond to VietBinoculars. Youden's J threshold is used for VietBinoculars to determine the maximum detection performance. Binoculars employs Falcon-7B and Falcon-7B-Instruct as the observer model $\mathcal{M}_1$ and the performer model $\mathcal{M}_2$, respectively.
  • Figure 5: Detection AUROC for various zero-shot and supervised learning methods on Vietnamese out-of-domain datasets: (a) Gemma-3-12B-News; (b) Gemma-3-12B-VuTrongPhung; (c) Sailor2-8B-VuTrongPhung. Zero-shot detectors are represented with the "\\" hatch pattern, while supervised learning detectors are represented with the "o" hatch pattern. DetectGPT uses top-$k$ and top-$p$ sampling with parameters $k=40$ and $p=0.96$.
  • ...and 3 more figures
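
The three global threshold rules named in the Figure 2 caption (Youden's J, Closest point, and TPR at a fixed FPR) can be sketched as follows. This is an illustrative sketch using the standard textbook definitions of each rule, not the authors' code. The function names and toy scores are hypothetical, and it assumes, as in the original Binoculars method, that a text is flagged as machine-generated when its score is at or below the threshold.

```python
import numpy as np


def roc_points(scores, labels):
    """TPR/FPR at every candidate threshold (positive = machine-generated).

    Assumption: a text is predicted positive when score <= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    thresholds = np.unique(scores)  # sorted candidate thresholds
    pred = scores[None, :] <= thresholds[:, None]
    tpr = (pred & labels).sum(axis=1) / labels.sum()
    fpr = (pred & ~labels).sum(axis=1) / (~labels).sum()
    return thresholds, tpr, fpr


def youden_j_threshold(thresholds, tpr, fpr):
    """Threshold maximizing Youden's J = TPR - FPR."""
    return thresholds[np.argmax(tpr - fpr)]


def closest_point_threshold(thresholds, tpr, fpr):
    """Threshold whose ROC point is nearest to the ideal corner (FPR=0, TPR=1)."""
    return thresholds[np.argmin(np.hypot(fpr, 1.0 - tpr))]


def tpr_at_fpr_threshold(thresholds, tpr, fpr, max_fpr):
    """Threshold with the highest TPR among those with FPR <= max_fpr."""
    allowed = np.flatnonzero(fpr <= max_fpr)
    if allowed.size == 0:
        return None
    return thresholds[allowed[np.argmax(tpr[allowed])]]


# Toy scores (hypothetical): five machine-generated texts, four human texts.
toy_scores = [0.70, 0.75, 0.80, 0.82, 0.90, 0.85, 0.95, 1.00, 1.05]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0]

thr, tpr, fpr = roc_points(toy_scores, toy_labels)
t_youden = youden_j_threshold(thr, tpr, fpr)
t_closest = closest_point_threshold(thr, tpr, fpr)
t_low_fpr = tpr_at_fpr_threshold(thr, tpr, fpr, max_fpr=0.0)
```

A fixed-FPR rule such as TPR@0.06%FPR trades some recall for a bound on how often human text is wrongly flagged, whereas Youden's J and the Closest point balance both error types.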