Why is "Chicago" Predictive of Deceptive Reviews? Using LLMs to Discover Language Phenomena from Lexical Cues
Jiaming Qu, Mengtian Guo, Yue Wang
TL;DR
This paper addresses an interpretability gap in deception detection: classifiers learn predictive lexical cues, but those cues are hard for people to interpret. It introduces a conjecture-then-validate pipeline that prompts large language models (LLMs) to verbalize the language phenomena associated with predictive words identified from Chicago hotel reviews, and then algorithmically validates these conjectures. Across three research questions, the study shows that the conjectured phenomena are predictive and generalize to similar domains, particularly when derived from the predictive words, and that they are more predictive than phenomena drawn from LLMs' prior knowledge or obtained through in-context learning. The work suggests a practical way to help users assess the credibility of online reviews when algorithmic filters are unavailable. It also outlines limitations and future directions, including iterative prompting and expert evaluation, supported by related human studies.
Abstract
Deceptive reviews mislead consumers, harm businesses, and undermine trust in online marketplaces. Machine learning classifiers can learn from large amounts of training examples to effectively distinguish deceptive reviews from genuine ones. However, the distinguishing features learned by these classifiers are often subtle, fragmented, and difficult for humans to interpret. In this work, we explore using large language models (LLMs) to translate machine-learned lexical cues into human-understandable language phenomena that can differentiate deceptive reviews from genuine ones. We show that language phenomena obtained in this manner are empirically grounded in data, generalizable across similar domains, and more predictive than phenomena either in LLMs' prior knowledge or obtained through in-context learning. These language phenomena have the potential to aid people in critically assessing the credibility of online reviews in environments where deception detection classifiers are unavailable.
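To make the conjecture-then-validate idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes predictive words are ranked by a smoothed log-odds ratio (the source does not specify the classifier), represents a language phenomenon as a boolean predicate over review text, and replaces the LLM step with a hypothetical placeholder, `conjecture_stub`. All function names and the toy reviews are invented for illustration.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple

Review = Tuple[str, bool]  # (text, is_deceptive)

PUNCT = ".,!?;:\"'()"

def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(PUNCT).lower() for t in text.split() if t.strip(PUNCT)]

def predictive_words(reviews: List[Review], top_k: int = 3) -> List[str]:
    """Rank words by smoothed log-odds of occurring in deceptive vs. genuine reviews.

    Assumption: a simple stand-in for the machine-learned lexical cues.
    """
    dec, gen = Counter(), Counter()
    for text, is_deceptive in reviews:
        # Count each word once per review (document frequency).
        (dec if is_deceptive else gen).update(set(tokenize(text)))
    n_dec = sum(1 for _, d in reviews if d)
    n_gen = len(reviews) - n_dec

    def log_odds(w: str) -> float:
        # Add-one smoothing keeps the ratio finite for unseen words.
        return math.log((dec[w] + 1) / (n_dec + 2)) - math.log((gen[w] + 1) / (n_gen + 2))

    vocab = set(dec) | set(gen)
    # Sort by descending log-odds, breaking ties alphabetically.
    return sorted(vocab, key=lambda w: (-log_odds(w), w))[:top_k]

def conjecture_stub(words: List[str]) -> Callable[[str], bool]:
    """Hypothetical placeholder for the LLM step.

    A real pipeline would prompt an LLM to verbalize a phenomenon; here the
    'phenomenon' is simply 'the review mentions any cue word'.
    """
    cues = set(words)
    return lambda text: any(t in cues for t in tokenize(text))

def validate(phenomenon: Callable[[str], bool], reviews: List[Review]) -> float:
    """Accuracy of labeling a review deceptive exactly when the phenomenon fires."""
    correct = sum(phenomenon(text) == is_deceptive for text, is_deceptive in reviews)
    return correct / len(reviews)

# Invented toy reviews, for illustration only.
train = [
    ("My stay in Chicago was amazing", True),
    ("Chicago hotel with luxury rooms", True),
    ("The room was small but clean", False),
    ("Bathroom was small and the elevator slow", False),
]
held_out = [
    ("Chicago trip was perfect", True),
    ("Parking was expensive", False),
]

cues = predictive_words(train, top_k=1)
phenomenon = conjecture_stub(cues)
accuracy = validate(phenomenon, held_out)
```

Keeping the phenomenon as a plain predicate separates the conjecture step from the validation step, so any source of conjectures (an LLM prompt, a human expert) could be scored against held-out reviews in the same way.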
