On Fact and Frequency: LLM Responses to Misinformation Expressed with Uncertainty
Yana van de Sande, Gunes Açar, Thabo van Woudenberg, Martha Larson
TL;DR
The paper investigates how large language models handle misinformation expressed with uncertainty. Verified-false propositions are transformed into uncertainty-laden statements using a cue-based typology, and GPT-4o, LLaMA3, and DeepSeek-v2 are then asked to fact-check them. In approximately 25% of cases the classification changes from false to not-false after transformation, with doxastic cues driving the larger effects. Neither modality nor most linguistic subcategories explain the changes. However, frequency estimates elicited from the models show a small but significant association with the truth judgments, which suggests a potential explanatory signal for LLM fact-check decisions. The paper contributes an uncertainty typology, a transformed dataset, and a frequency-elicitation paradigm, with implications for designing robust fact-checking tools and for understanding how LLMs represent uncertainty.
Abstract
We study LLM judgments of misinformation expressed with uncertainty. Our experiments examine the response of three widely used LLMs (GPT-4o, LLaMA3, DeepSeek-v2) to misinformation propositions that have been verified false and then transformed into uncertain statements according to an uncertainty typology. Our results show that after transformation, LLMs change their fact-checking classification from false to not-false in 25% of the cases. Analysis reveals that the change cannot be explained by predictors to which humans are expected to be sensitive, i.e., modality, linguistic cues, or argumentation strategy. The exception is doxastic transformations, which use linguistic cue phrases such as "It is believed ...". To gain further insight, we prompt the LLM to make another judgment about the transformed misinformation statements that is not related to truth value. Specifically, we study LLM estimates of the frequency with which people make the uncertain statement. We find a small but significant correlation between judgment of fact and estimation of frequency.
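The two quantities reported in the abstract, the share of false-to-not-false changes and the correlation between fact judgment and frequency estimate, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the cue templates, the label strings "false" and "not-false", and the helper names `transform`, `flip_rate`, and `pearson` are hypothetical and are not taken from the paper. The correlation is a plain Pearson coefficient, which coincides with the point-biserial correlation when one variable is binary; the statistical test actually used in the paper is not stated in this section.

```python
from statistics import mean, pstdev

# Hypothetical cue templates (assumption); the paper's typology is richer.
CUE_TEMPLATES = {
    "doxastic": "It is believed that {p}",
    "epistemic": "It is possible that {p}",
}


def transform(proposition: str, cue: str) -> str:
    """Wrap a verified-false proposition in an uncertainty cue phrase."""
    # Naive lowercasing of the first character; proper nouns would need care.
    body = proposition[0].lower() + proposition[1:]
    return CUE_TEMPLATES[cue].format(p=body)


def flip_rate(labels: list[str]) -> float:
    """Share of transformed statements no longer classified as 'false'."""
    return sum(label != "false" for label in labels) / len(labels)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; point-biserial when xs is coded 0/1."""
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (pstdev(xs) * pstdev(ys))
```

In use, `labels` would hold the fact-check label an LLM returns for each transformed statement, `xs` the same labels coded as 0 (false) or 1 (not-false), and `ys` the frequency estimates elicited from the same model for the same statements.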
