Quantifying Cognitive Bias Induction in LLM-Generated Content

Abeer Alessa, Param Somane, Akshaya Lakshminarasimhan, Julian Skirzynski, Julian McAuley, Jessica Echterhoff

TL;DR

Quantifies how LLM-generated content can bias human decisions by altering framing, primacy, and truthfulness in summarization and news fact-checking. Introduces metrics (framing-change φ_frame, primacy ψ_pri, and hallucination Δ_H) and a self-updating NewsLensSync dataset to evaluate multiple model families across domains. Demonstrates substantial exposure to bias (framing ~26%, primacy ~10%, post-cutoff hallucinations ~60%) and shows that biased content can shift consumer decisions and willingness to pay. Evaluates 18 mitigation strategies, revealing model- and content-dependent trade-offs and offering practical guidance for reducing content-induced biases in real-world AI-assisted decision-making.

Abstract

Large language models (LLMs) are integrated into applications such as shopping reviews, summarization, and medical diagnosis support, where their use affects human decisions. We investigate the extent to which LLMs expose users to biased content and demonstrate its effect on human decision-making. We assess five LLM families on summarization and news fact-checking tasks, evaluating how consistent the LLMs are with their context and how often they hallucinate on a new self-updating dataset. Averaged across all tested models, LLMs expose users to content that changes the context's sentiment in 26.42% of cases (framing bias), hallucinate on 60.33% of post-knowledge-cutoff questions, and highlight context from earlier parts of the prompt in 10.12% of cases (primacy bias). We further find that humans are 32% more likely to purchase the same product after reading an LLM-generated summary of a review than after reading the original review. To address these issues, we evaluate 18 mitigation methods across three LLM families and find that targeted interventions can be effective.
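
As a rough illustration of what a rate such as "changes the context's sentiment in 26.42% of cases" measures, the sketch below counts how often a summary's sentiment label differs from that of its source text. It is a minimal sketch under stated assumptions, not the paper's φ_frame metric: the function name `framing_change_rate`, the three-way label set, and the example labels are all hypothetical, and the sentiment labels are assumed to have been produced beforehand by some classifier.

```python
from typing import Sequence


def framing_change_rate(
    source_labels: Sequence[str], summary_labels: Sequence[str]
) -> float:
    """Fraction of (source, summary) pairs whose sentiment labels differ.

    Hypothetical helper for illustration only; the paper's own
    framing metric may be defined differently.
    """
    if len(source_labels) != len(summary_labels):
        raise ValueError("label sequences must have the same length")
    if not source_labels:
        raise ValueError("at least one pair is required")
    # Count pairs where the summary's sentiment differs from the source's.
    changed = sum(s != t for s, t in zip(source_labels, summary_labels))
    return changed / len(source_labels)


# Made-up labels for four review/summary pairs (not data from the paper).
source = ["negative", "neutral", "positive", "negative"]
summary = ["positive", "neutral", "positive", "neutral"]

rate = framing_change_rate(source, summary)
print(f"{rate:.2%}")  # prints "50.00%"
```

Here two of the four summaries carry a different sentiment label than their source, so the rate is 50%. Averaging such per-model rates across models would give an aggregate figure of the kind reported in the abstract.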

Paper Structure

This paper contains 46 sections, 4 equations, 3 figures, and 13 tables.

Figures (3)

  • Figure 1: LLMs alter information from a source text when performing a task for the user (e.g., when an LLM summary has a different sentiment compared to the original text). This model behavior introduces biased content to humans and can hence affect their decision-making. We evaluate how LLMs highlight source content, leading to exposure/primacy bias for users, how LLMs reframe sentiments, leading to framing bias for humans, and how LLMs hallucinate, leading to authority/confirmation bias.
  • Figure 2: When LLMs process context for users to consume, they may change its content, e.g., by changing its sentiment or omitting some relevant parts. Exposure to altered content may elicit cognitive biases, such as positive framing bias or primacy bias, and subsequently lead humans to make different decisions than they would if they saw the original text.
  • Figure 3: Mean selection rates for manufacturers depending on whether participants read the original review of their product (neutral or negative framing) or its summary (positive framing). Error bars display 95% confidence intervals and stars denote significance levels (* p < 0.05, ** p < 0.01, *** p < 0.001).