BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages

Jason Lucas; Matt Murtagh-White; Adaku Uchendu; Ali Al-Lawati; Michiharu Yamashita; Dominik Macko; Ivan Srba; Robert Moro; Dongwon Lee

TL;DR

We present AXL-CoI (Adversarial Cross-Lingual Agentic Chain-of-Interactions), a novel multi-agentic framework for controlled fake/real news generation, paired with mPURIFY, a quality filtering pipeline that ensures dataset integrity.

Abstract

Multilingual falsehoods threaten information integrity worldwide, yet detection benchmarks remain confined to English or a few high-resource languages, leaving low-resource linguistic communities without robust defense tools. We introduce BLUFF, a comprehensive benchmark for detecting false and synthetic content, spanning 79 languages with over 202K samples, combining human-written fact-checked content (122K+ samples across 57 languages) and LLM-generated content (79K+ samples across 71 languages). BLUFF uniquely covers both high-resource "big-head" (20) and low-resource "long-tail" (59) languages, addressing critical gaps in multilingual research on detecting false and synthetic content. Our dataset features four content types (human-written, LLM-generated, LLM-translated, and hybrid human-LLM text), bidirectional translation (English$\leftrightarrow$X), 39 textual modification techniques (36 manipulation tactics for fake news, 3 AI-editing strategies for real news), and varying edit intensities generated using 19 diverse LLMs. We present AXL-CoI (Adversarial Cross-Lingual Agentic Chain-of-Interactions), a novel multi-agentic framework for controlled fake/real news generation, paired with mPURIFY, a quality filtering pipeline ensuring dataset integrity. Experiments reveal that state-of-the-art detectors suffer up to 25.3% F1 degradation on low-resource versus high-resource languages. BLUFF provides the research community with a multilingual benchmark, extensive linguistic-oriented benchmark evaluation, comprehensive documentation, and open-source tools to advance equitable falsehood detection. Dataset and code are available at: https://jsl5710.github.io/BLUFF/

Paper Structure (179 sections, 2 equations, 28 figures, 78 tables, 1 algorithm)

Introduction
Related Work
Multilingual Falsehood Detection.
BLUFF Framework & Dataset Construction
Problem Formulation
AXL-CoI: Adversarial Cross-Lingual Chain-of-Interactions
Autonomous Dynamic Impersonation Self-Attack (ADIS).
Cross-Lingual Agentic Chain-of-Interactions.
Multilingual Generation Pipeline
Source Corpora.
mPURIFY: Multilingual Quality Filter
Human-Written Data Curation
Dataset Statistics
Scale and Composition.
Our Proposal: The BLUFF Benchmark
...and 164 more sections

Figures (28)

Figure 1: BLUFF Adversarial Cross-Lingual Chain-of-Interactions Agentic (AXL-CoI) Framework Diagram
Figure 2: Generation $\rightarrow$ Defect Removal $\rightarrow$ LLM-mPURIFY pipeline. Overlapping bars show retention at each stage: generated samples (blue, 100%), post-defect filtering (hatched orange), and post-mPURIFY (solid green). GPT-4.1 variants show the highest retention (86--87%), while Llama4-Scout shows the highest rejection rate (99.8% filtered).
Figure 3: Cross-lingual transfer heatmaps (macro-F1, averaged across 10 encoder models) for binary veracity classification. (a) Language family: within-family avg. 66.6%, cross-family 51.2%. (b) Syntax: VSO targets prove challenging (28--63%). (c) Script: same-script avg. 68.8%, cross-script 52.4%; Arabic targets consistently poor (30--47%). Abbreviations---Family: Indo=Indo-European, Sino=Sino-Tibetan, Drav=Dravidian, AuA=Austroasiatic, AuN=Austronesian, Afro=Afro-Asiatic, Turk=Turkic, Creo=Creole. Script: Lat=Latin, Cyr=Cyrillic, Ara=Arabic, Ind=Indic, Gre=Greek.
Figure 4: Block diagram of the Dynamic Persona Cycling mechanism (Algorithm \ref{alg:persona_cycling}). The generator $G$ initializes a persona pool $\mathcal{P}$ with success and failure counters. For each input $x$, the model is prompted with persona $p_{idx}$. Success updates the success counter and proceeds to the next input. Refusal updates the failure counter, cycles $idx$ (creating a new persona if needed), and retries the same input $x$.
Figure 5: AXL-CoI X$\rightarrow$English Fake News prompt template (10 chains). Mirrors the Eng$\rightarrow$X variant but begins with non-English input and translates the fabricated narrative into English.
...and 23 more figures
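
The control flow described in the Figure 4 caption can be illustrated with a minimal Python sketch. This is not the authors' Algorithm; it only follows the caption. The names `PersonaPool`, `cycle_personas`, `generate`, and `make_persona` are hypothetical, a refusal is assumed to be signalled by `generate` returning `None`, "cycling $idx$" is assumed to mean advancing to the next persona in the pool, and the `max_retries` cap is an added assumption that the caption does not mention.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PersonaPool:
    """Persona pool P with per-persona success and failure counters."""
    personas: List[str]
    successes: List[int] = field(default_factory=list)
    failures: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # One counter pair per initial persona.
        self.successes = [0] * len(self.personas)
        self.failures = [0] * len(self.personas)

    def add(self, persona: str) -> None:
        # Register a newly created persona with fresh counters.
        self.personas.append(persona)
        self.successes.append(0)
        self.failures.append(0)


def cycle_personas(
    inputs: List[str],
    generate: Callable[[str, str], Optional[str]],  # hypothetical: None means refusal
    pool: PersonaPool,
    make_persona: Callable[[int], str],             # hypothetical persona factory
    max_retries: int = 5,                           # assumption: cap on attempts per input
) -> List[Optional[str]]:
    idx = 0
    outputs: List[Optional[str]] = []
    for x in inputs:
        result: Optional[str] = None
        for _ in range(max_retries):
            result = generate(pool.personas[idx], x)
            if result is not None:
                # Success: update the counter and move on to the next input.
                pool.successes[idx] += 1
                break
            # Refusal: update the failure counter, cycle idx, retry the same x.
            pool.failures[idx] += 1
            idx += 1
            if idx == len(pool.personas):
                # Pool exhausted: create a new persona.
                pool.add(make_persona(idx))
        outputs.append(result)
    return outputs
```

In this sketch the persona index persists across inputs, so a persona that succeeds keeps being used until it is refused; whether the original algorithm behaves the same way is not stated in the caption.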
