The Illusion of Rationality: Tacit Bias and Strategic Dominance in Frontier LLM Negotiation Games
Manuel S. Ríos, Ruben F. Manrique, Nicanor Quijano, Luis F. Giraldo
TL;DR
The paper asks whether stronger general reasoning in frontier LLMs leads to rational, unbiased negotiation strategies. Using NegotiationArena, it tests multiple models across three multi-turn bargaining games, revealing model-specific strategic equilibria, persistent anchoring biases, and clear dominance patterns that favor stronger models. These findings challenge the assumption that scaling alone yields fair and stable negotiation outcomes and highlight significant risks for real-world deployment. The work calls for mechanisms beyond scaling to mitigate cognitive biases and ensure equitable interactions among heterogeneous agents.
Abstract
Large language models (LLMs) are increasingly being deployed as autonomous agents on behalf of institutions and individuals in economic, political, and social settings that involve negotiation. Yet this trend carries significant risks if their strategic behavior is not well understood. In this work, we revisit the NegotiationArena framework and run controlled simulation experiments on a diverse set of frontier LLMs across three multi-turn bargaining games: Buyer-Seller, Multi-turn Ultimatum, and Resource Exchange. We ask whether improved general reasoning capabilities lead to rational, unbiased, and convergent negotiation strategies. Our results challenge this assumption. We find that models diverge into distinct, model-specific strategic equilibria rather than converging to a unified optimal behavior. Moreover, strong numerical and semantic anchoring effects persist: initial offers are highly predictive of final agreements, and models consistently generate biased proposals by collapsing diverse internal valuations into rigid, generic price points. More concerningly, we observe dominance patterns in which some models systematically achieve higher payoffs than their counterparts. These findings underscore an urgent need to develop mechanisms to mitigate these issues before deploying such systems in real-world scenarios.
