Users Favor LLM-Generated Content -- Until They Know It's AI
Petr Parshakov, Iuliia Naidenova, Sofia Paklina, Nikita Matkin, Cornel Nesseler
TL;DR
The paper investigates how awareness of content provenance (human vs. AI) influences judgments of responses to popular questions. It employs a controlled field experiment with 846 participants evaluating 25 questions derived from Quora and Stack Overflow, using four LLMs and a human, under aware and unaware conditions. A logistic regression framework shows that knowing the origin increases the likelihood of selecting human-generated content, with heterogeneity by gender, programming skills, and the question domain; overall, AI-generated responses are favored when origin is concealed, signaling a provenance bias. The results have practical implications for deploying AI in contexts that require high trust and quality assessments, and suggest that transparency about content origin can modulate user trust and acceptance. The estimated specification is $\Pr(\text{human}_i = 1) = \beta_0 + \beta_1 \text{aware}_i + \beta_2 \text{female}_i + \beta_3 \text{age}_i + \beta_4 \text{duration}_i + \beta_5 \text{education\_level}_i + \beta_6 \text{education\_field}_i + \beta_7 \text{programming\_skills}_i + \beta_8 \text{model}_i + \beta_9 \text{question\_field}_i + \epsilon_i$.
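To make the regression setup concrete, the following is a minimal sketch, not the authors' code. It fits a logistic model with a single predictor, `aware`, on synthetic data; the paper's specification includes further covariates (gender, age, duration, education, programming skills, model, question field) that are omitted here. The function names, sample size, and the choice probabilities `p_unaware` and `p_aware` are illustrative assumptions and are not estimates from the paper.

```python
import math
import random


def sigmoid(z):
    """Logistic link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n=1000, p_unaware=0.35, p_aware=0.50, seed=0):
    """Synthetic choices. The probabilities are made up for illustration only."""
    rng = random.Random(seed)
    aware = [rng.randint(0, 1) for _ in range(n)]  # 1 = origin disclosed
    human = [
        1 if rng.random() < (p_aware if a else p_unaware) else 0  # 1 = chose human
        for a in aware
    ]
    return aware, human


def fit_logit(xs, ys, lr=0.5, steps=1000):
    """Fit Pr(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # residual on the probability scale
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


aware, human = simulate()
b0, b1 = fit_logit(aware, human)

# With one binary predictor, the fitted probabilities equal the group shares.
print("Pr(human | unaware) =", round(sigmoid(b0), 3))
print("Pr(human | aware)   =", round(sigmoid(b0 + b1), 3))
print("coefficient on aware (log-odds) =", round(b1, 3))
```

In this simplified setting a positive coefficient on `aware` means that disclosing the origin raises the odds of choosing the human response, which is the direction of the effect the paper reports. A full replication would add the remaining covariates and would normally rely on a statistics package rather than hand-written gradient ascent.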
Abstract
In this paper, we investigate how individuals evaluate responses to popular questions generated by humans and by large language models (LLMs) when the source of the content is either concealed or disclosed. Through a controlled field experiment, participants were presented with a set of questions, each accompanied by a response generated by either a human or an AI. In a randomized design, half of the participants were informed of the response's origin while the other half remained unaware. Our findings indicate that, overall, participants tend to prefer AI-generated responses. However, when the AI origin is revealed, this preference diminishes significantly, suggesting that evaluative judgments are influenced by the disclosure of the response's provenance rather than solely by its quality. These results underscore a bias against AI-generated content, highlighting the societal challenge of improving the perception of AI work in contexts where quality assessments should be paramount.
