Unmask It! AI-Generated Product Review Detection in Dravidian Languages

Somsubhra De; Advait Vats

TL;DR

This work addresses the detection of AI-generated product reviews in two low-resource Dravidian languages, Tamil and Malayalam, by evaluating a broad spectrum of methods, from traditional machine learning (ML) to state-of-the-art transformers (e.g., IndicSBERT, MuRIL, XLM-RoBERTa, MalayalamBERT). On the Tamil and Malayalam datasets, transformer-based approaches substantially outperform both traditional methods and deep learning (DL) architectures, with IndicSBERT performing best for Tamil and MalayalamBERT for Malayalam; some transformer runs reach near-perfect precision and recall. Qualitative analyses reveal language-specific patterns that distinguish AI-generated from human-written reviews and highlight the value of human-in-the-loop insights. The findings point to the practical potential of transformer-based detectors for improving trust in e-commerce platforms that serve under-resourced languages. The paper also outlines future work on LLMs, ensemble strategies, larger and more diverse corpora, and ethical considerations.

Abstract

The rise of Generative AI has led to a surge in AI-generated reviews, posing a serious threat to the credibility of online platforms. Reviews are a primary source of information about products and services, and authentic reviews play a vital role in consumer decision-making. Fabricated content misleads consumers, undermines trust, and facilitates potential fraud in digital marketplaces. This study focuses on detecting AI-generated product reviews in Tamil and Malayalam, two low-resource languages in which this problem remains under-explored. We evaluate a range of approaches, from traditional machine learning methods to advanced transformer-based models such as Indic-BERT, IndicSBERT, MuRIL, XLM-RoBERTa and MalayalamBERT. Our findings highlight the effectiveness of state-of-the-art transformers in accurately identifying AI-generated content, demonstrating their potential to improve fake review detection in low-resource language settings.
