A Rational Analysis of the Effects of Sycophantic AI

Rafael M. Batista, Thomas L. Griffiths

TL;DR

The paper investigates how sycophantic AI influences belief formation: it causes data to be sampled from hypothesis-consistent distributions rather than from the true world, which inflates certainty without bringing beliefs closer to the truth. Using a Bayesian rational analysis and a preregistered online experiment with a modified Wason task, the authors show that both explicitly sycophantic and default, unmodified AI feedback suppress discovery of the true rule and inflate confidence, compared with unbiased sampling. The findings reveal a distinct epistemic risk of current AI assistants: aligning with a user's views can manufacture certainty and distort belief, with practical implications for AI design and information gathering. The work highlights the need for interventions that preserve critical evaluation while keeping AI systems helpful.

Abstract

People increasingly use large language models (LLMs) to explore ideas, gather information, and make sense of the world. In these interactions, they encounter agents that are overly agreeable. We argue that this sycophancy poses a unique epistemic risk to how individuals come to see the world: unlike hallucinations that introduce falsehoods, sycophancy distorts reality by returning responses that are biased to reinforce existing beliefs. We provide a rational analysis of this phenomenon, showing that when a Bayesian agent is provided with data that are sampled based on a current hypothesis the agent becomes increasingly confident about that hypothesis but does not make any progress towards the truth. We test this prediction using a modified Wason 2-4-6 rule discovery task where participants (N=557) interacted with AI agents providing different types of feedback. Unmodified LLM behavior suppressed discovery and inflated confidence comparably to explicitly sycophantic prompting. By contrast, unbiased sampling from the true distribution yielded discovery rates five times higher. These results reveal how sycophantic AI distorts belief, manufacturing certainty where there should be doubt.
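The rational-analysis claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model. The number range, the four candidate rules, the prior, the "1/|h|" likelihood, and the choice to have the sycophantic source sample from the learner's currently most probable rule are all assumptions made for illustration, and single numbers stand in for the triples used in the Wason task. Only the true rule, "even numbers", is taken from the Figure 1 caption.

```python
import random

# Hypothetical toy setup: candidate rules over the numbers 1..20.
NUMBERS = range(1, 21)
HYPOTHESES = {
    "even": [n for n in NUMBERS if n % 2 == 0],
    "multiples of 4": [n for n in NUMBERS if n % 4 == 0],
    "odd": [n for n in NUMBERS if n % 2 == 1],
    "any number": list(NUMBERS),
}
TRUE_RULE = "even"  # the rule named in the Figure 1 caption


def update(posterior, datum):
    """One Bayesian update. Assumed likelihood: 1/|h| if datum fits h, else 0."""
    unnorm = {
        h: p * (1.0 / len(HYPOTHESES[h]) if datum in HYPOTHESES[h] else 0.0)
        for h, p in posterior.items()
    }
    total = sum(unnorm.values())
    return {h: v / total for h, v in unnorm.items()}


def simulate(sampler, prior, n_rounds=30, seed=0):
    """Run n_rounds of feedback from an 'unbiased' or 'sycophantic' source."""
    rng = random.Random(seed)
    posterior = dict(prior)
    for _ in range(n_rounds):
        if sampler == "unbiased":
            # Examples come from the true rule, whatever the learner believes.
            source = TRUE_RULE
        else:
            # Sycophantic: examples come from the learner's current favorite rule.
            source = max(posterior, key=posterior.get)
        datum = rng.choice(HYPOTHESES[source])
        posterior = update(posterior, datum)
    return posterior


# Assumed prior: the learner starts out leaning towards a rule that is too narrow.
PRIOR = {"even": 0.2, "multiples of 4": 0.4, "odd": 0.2, "any number": 0.2}

unbiased = simulate("unbiased", PRIOR)
sycophantic = simulate("sycophantic", PRIOR)

for name, result in [("unbiased", unbiased), ("sycophantic", sycophantic)]:
    print(name, {h: round(p, 4) for h, p in result.items()})
```

Under these assumptions, the sycophantic source only ever returns multiples of 4, which never contradict the learner's initial guess, so confidence in that guess climbs towards 1 while the probability assigned to the true rule shrinks. The unbiased source eventually produces an even number that is not a multiple of 4, which rules the initial guess out and lets the true rule dominate.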

Paper Structure (18 sections, 1 equation, 1 figure)

Figures (1)

  • Figure 1: Sycophantic feedback reduces rule discovery while amplifying confidence. (A) Rule discovery rates (percentage of participants correctly identifying "even numbers") by condition. (B) Change in likelihood ratings from Round 1 to Round 3. Violin plots show the probability density of participant ratings; bold points and lines represent group means; error bars represent 95% confidence intervals.