Domain-Independent Deception: A New Taxonomy and Linguistic Analysis
Rakesh M. Verma, Nachum Dershowitz, Victor Zeng, Dainis Boumber, Xuting Liu
TL;DR
This paper tackles deception detection across multiple domains by introducing a formal, domain-agnostic framework. It defines deception with a probabilistic, exposure-based criterion and presents a multi-dimensional taxonomy covering agents, stratagems, goals, and exposure, along with further dimensions such as motivation and modality. Using linguistic cue analysis across real-world deception datasets and deep-learning experiments with transformer models, the authors find both universal and domain-specific signals. The results point to substantial potential for cross-domain transfer, but also to limited generalization when training data is not diverse and domain-spanning. The work also provides guidelines for rigorous systematic reviews and critiques past claims of universal linguistic cues, with the aim of stabilizing the field and supporting robust, domain-independent deception detectors.
Abstract
Internet-based economies and societies are drowning in deceptive attacks. These attacks take many forms, such as fake news, phishing, and job scams, which we call "domains of deception." Machine-learning and natural-language-processing researchers have been attempting to ameliorate this precarious situation by designing domain-specific detectors. Only a few recent works have considered domain-independent deception. We collect these disparate threads of research and investigate domain-independent deception. First, we provide a new computational definition of deception and break down deception into a new taxonomy. Then, we analyze the debate on linguistic cues for deception and supply guidelines for systematic reviews. Finally, we investigate common linguistic features and give evidence for knowledge transfer across different forms of deception.
