Human Trust in AI Search: A Large-Scale Experiment
Haiwen Li, Sinan Aral
TL;DR
This study investigates how GenAI search designs shape human trust and behavior. It combines a global assessment of exposure to GenAI search with a preregistered, randomized experiment in the U.S. that tests trust cues, including references, uncertainty highlighting, social feedback, and explanations, using a 5-item trust metric and willingness to share as primary outcomes. Across two regression specifications, the authors find that participants trust GenAI search less than traditional search on average, but that references raise trust even when they are invalid, positive social feedback raises trust while negative feedback lowers it, and uncertainty highlighting consistently reduces trust. The results also reveal demographic and topical heterogeneity in susceptibility to GenAI misrepresentations and show that trust predicts behavior: participants who trust GenAI more click more and spend less time evaluating results. Together, these findings underscore the need for careful GenAI interface design to address the trust gap and promote safer information seeking.
Abstract
Large Language Models (LLMs) increasingly power generative search engines which, in turn, drive human information seeking and decision making at scale. The extent to which humans trust generative artificial intelligence (GenAI) can therefore influence what we buy, how we vote and our health. Unfortunately, no work establishes the causal effect of generative search designs on human trust. Here we execute ~12,000 search queries across seven countries, generating ~80,000 real-time GenAI and traditional search results, to understand the extent of current global exposure to GenAI search. We then use a preregistered, randomized experiment on a large study sample representative of the U.S. population to show that while participants trust GenAI search less than traditional search on average, reference links and citations significantly increase trust in GenAI, even when those links and citations are incorrect or hallucinated. Uncertainty highlighting, which reveals GenAI's confidence in its own conclusions, makes us less willing to trust and share generative information whether that confidence is high or low. Positive social feedback increases trust in GenAI while negative feedback reduces trust. These results imply that GenAI designs can increase trust in inaccurate and hallucinated information and reduce trust when GenAI's certainty is made explicit. Trust in GenAI varies by topic and with users' demographics, education, industry employment and GenAI experience, revealing which sub-populations are most vulnerable to GenAI misrepresentations. Trust, in turn, predicts behavior, as those who trust GenAI more click more and spend less time evaluating GenAI search results. These findings suggest directions for GenAI design to safely and productively address the AI "trust gap."
