Anchoring Bias in Large Language Models: An Experimental Study
Jiaxu Lou, Yifan Sun
TL;DR
This paper investigates anchoring bias in large language models, using a 62-sample dataset with varied anchor types to quantify how initial hints influence numerical predictions. It demonstrates that strong models are consistently susceptible to anchor hints, especially expert anchors, and finds that common prompting-based mitigations such as Chain-of-Thought, Thoughts of Principles, ignoring anchors, and reflection fail to robustly reduce the bias. A multi-angle information approach (Both-Anchor) shows partial mitigation in some cases, but overall bias persists, highlighting the need for more robust bias-mitigation techniques. The study also contrasts model behavior with human anchoring effects, suggesting practical implications for deploying LLMs in decision-making contexts and offering datasets and prompts to support further research.
Abstract
Large Language Models (LLMs) such as GPT-4 and Gemini have significantly advanced artificial intelligence by enabling machines to generate and comprehend human-like text. Despite their impressive capabilities, LLMs are subject to limitations, including various biases. While much research has explored demographic biases, cognitive biases in LLMs have received less scrutiny. This study examines anchoring bias, a cognitive bias in which initial information disproportionately influences judgment. Using an experimental dataset, we examine how anchoring bias manifests in LLMs and evaluate the effectiveness of various mitigation strategies. Our findings highlight the sensitivity of LLM responses to biased hints. Our experiments also show that mitigating anchoring bias requires collecting hints from multiple angles, so that the LLM is not anchored to any individual piece of information; simple strategies such as Chain-of-Thought, Thoughts of Principles, Ignoring Anchor Hints, and Reflection are not sufficient.
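As an illustration of how sensitivity to an anchor hint might be quantified, the following is a minimal sketch and not the metric or prompt format used in this study. The names `build_prompt` and `anchor_shift` and the "Hint:" prefix are hypothetical, introduced only for this example. The sketch assumes the model's numerical answer has been obtained once without any hint (the baseline) and once with a hint, and it reports what fraction of the gap between the baseline and the anchor value the hinted answer closed.

```python
from typing import List


def build_prompt(question: str, hints: List[str]) -> str:
    """Prepend zero or more hint lines to a question.

    Hypothetical prompt layout: an empty list gives the no-anchor
    baseline, one hint gives a single-anchor prompt, and two opposing
    hints give a prompt in the spirit of the multi-angle setting.
    """
    lines = [f"Hint: {hint}" for hint in hints]
    lines.append(question)
    return "\n".join(lines)


def anchor_shift(baseline: float, anchored: float, anchor: float) -> float:
    """Fraction of the baseline-to-anchor gap closed by the hinted answer.

    0.0 means the answer did not move; 1.0 means it landed exactly on
    the anchor. This is an assumed illustrative measure only.
    """
    gap = anchor - baseline
    if gap == 0:
        raise ValueError("anchor equals baseline; shift is undefined")
    return (anchored - baseline) / gap
```

For example, if the model answers 100 without a hint and 150 after a hint suggesting 200, `anchor_shift(100, 150, 200)` is 0.5: the answer moved halfway toward the anchor. A ratio is used rather than a raw difference so that questions with very different numerical scales can be compared.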
