Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks
Nouar Aldahoul, Yasir Zaki
TL;DR
The paper addresses misinformation that has been disguised by adversarial transformations, proposing a multilingual, multi-agent LLM framework with retrieval-augmented generation (RAG) that can be deployed as a web plugin. It introduces new attack-style datasets (MCQ, translation, summarization) across English, French, Spanish, Arabic, Hindi, and Chinese, and shows that a RAG-Llama system with multilingual embeddings outperforms vanilla LLMs at detecting false information while still recognizing true content. Accuracy on false information is high (often >99%), and recognition of true information remains strong across tasks and languages. The system can also run locally using embedding models such as multilingual-e5-large. The work highlights practical, low-cost test-time enhancements for misinformation detection, while acknowledging limitations such as topic misclassification and the need for up-to-date, secure retrieval databases.
Abstract
The rapid spread of misinformation on digital platforms threatens public discourse, emotional stability, and decision-making. While prior work has explored various adversarial attacks in misinformation detection, the specific transformations examined in this paper have not been systematically studied. We investigate three such transformations: language-switching across English, French, Spanish, Arabic, Hindi, and Chinese, followed by translation; query length inflation preceding summarization; and structural reformatting into multiple-choice questions. To counter them, we present a multilingual, multi-agent large language model framework with retrieval-augmented generation that can be deployed as a web plugin on online platforms. Our work underscores the importance of AI-driven misinformation detection in safeguarding online factual integrity against diverse attacks, while showcasing the feasibility of plugin-based deployment for real-world web applications.
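To make the idea of structural reformatting into multiple-choice questions concrete, the following is a minimal sketch of how a claim might be wrapped into an MCQ prompt. It is an illustration only, not the paper's dataset construction: the names `McqPrompt` and `to_mcq`, the question wording, and the choice of placing the claim among distractor statements are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class McqPrompt:
    """Hypothetical container for a claim reformatted as a multiple-choice question."""
    question: str
    options: list[str]

    def render(self) -> str:
        # Label options A, B, C, ... and join everything into one prompt string.
        letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        lines = [self.question]
        lines += [f"{letters[i]}. {opt}" for i, opt in enumerate(self.options)]
        return "\n".join(lines)


def to_mcq(claim: str, distractors: list[str], position: int = 0) -> McqPrompt:
    """Wrap a claim among distractor statements (assumed format, not the paper's)."""
    options = list(distractors)       # copy so the caller's list is not mutated
    options.insert(position, claim)   # place the claim at the requested slot
    return McqPrompt(
        question="Which of the following statements is correct?",
        options=options,
    )


if __name__ == "__main__":
    prompt = to_mcq(
        "The Moon is made of cheese.",
        ["The Moon orbits the Earth."],
        position=1,
    )
    print(prompt.render())
```

A detector that handles the plain claim correctly may behave differently once the same claim appears as one option inside a question, which is the kind of shift in surface form that this attack style relies on.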
