How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

Federico Bianchi; Patrick John Chia; Mert Yuksekgonul; Jacopo Tagliabue; Dan Jurafsky; James Zou

TL;DR

This work introduces NegotiationArena, an open-source framework for evaluating how LLM agents negotiate across multi-turn, two-agent scenarios. It implements three negotiation games (Resource Exchange, Multi-Turn Ultimatum, and Seller–Buyer) and benchmarks GPT-4, GPT-3.5, Claude-2, and Claude-2.1. The results show that turn order and role significantly affect outcomes, that strategic prompts can boost performance, and that the agents exhibit irrational behaviors such as anchoring. The study provides insights into the effects of social strategies and the weaknesses of current LLM negotiators, along with a flexible platform for probing theory of mind, reasoning, and robustness in inter-agent communication. These findings inform the design of more reliable and capable LLM-based negotiation agents and establish a baseline for future research in interactive AI systems.

Abstract

Negotiation is the basis of social interactions; humans negotiate everything from the price of cars to how to share common resources. With rapidly growing interest in using large language models (LLMs) to act as agents on behalf of human users, such LLM agents would also need to be able to negotiate. In this paper, we study how well LLMs can negotiate with each other. We develop NegotiationArena: a flexible framework for evaluating and probing the negotiation abilities of LLM agents. We implemented three types of scenarios in NegotiationArena to assess LLMs' behaviors in allocating shared resources (ultimatum games), aggregating resources (trading games), and buying and selling goods (price negotiations). Each scenario allows for multiple turns of flexible dialogue between LLM agents, enabling more complex negotiations. Interestingly, LLM agents can significantly boost their negotiation outcomes by employing certain behavioral tactics. For example, by pretending to be desolate and desperate, LLMs can improve their payoffs by 20% when negotiating against the standard GPT-4. We also quantify irrational negotiation behaviors exhibited by the LLM agents, many of which also appear in humans. Together, NegotiationArena offers a new environment to investigate LLM interactions, enabling new insights into LLMs' theory of mind, irrationality, and reasoning abilities.

Paper Structure (36 sections, 18 figures, 3 tables)

Introduction
Our contributions:
Scenarios in NegotiationArena
NegotiationArena Scenarios
Resource Exchange Scenario
Multi-Turn Ultimatum Game
Seller and Buyer Scenario
NegotiationArena Implementation
Benchmarking Agents in Negotiation Games
Negotiation Results
Insights From the Experiments
Turn and Role Matter.
Strategic Social Behavior in Games
Evidence of Irrationality
Seller and Buyer Game
...and 21 more sections

Figures (18)

Figure 1: A negotiation in the multi-turn ultimatum scenario. Agents use a structured conversation format to communicate. Here, aggressive behavior by Blue affected final payoff.
Figure 2:
Figure 3: Seller and Buyer. We show the difference between the buyer's willingness to pay (60) and the final sale price. A higher number means the buyer gets a greater payoff.
Figure 4: Reasoning patterns and messages from GPT-4.
Figure 5: An error from GPT-3.5 confusing GPT-4 in the Ultimatum game: GPT-4 offers a fair split, GPT-3.5 responds by proposing an (impossible) exchange of money. GPT-4 corrects the mistake twice but ends up offering most of its money for the split; GPT-3.5 eventually accepts.
...and 13 more figures
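
The quantity described in the Figure 3 caption (the buyer's willingness to pay minus the final sale price) can be sketched in a few lines of Python. This is a minimal illustration, not code from NegotiationArena: the function name `buyer_surplus` and the example sale prices are hypothetical, and only the willingness-to-pay value of 60 comes from the caption.

```python
def buyer_surplus(willingness_to_pay: float, sale_price: float) -> float:
    """Return willingness to pay minus the final sale price.

    A higher value means the buyer obtained a better deal.
    """
    return willingness_to_pay - sale_price


# Value stated in the Figure 3 caption.
WILLINGNESS_TO_PAY = 60

# Hypothetical final sale prices, chosen only for illustration
# (these are NOT results reported in the paper).
example_prices = [40, 50, 60]

surpluses = [buyer_surplus(WILLINGNESS_TO_PAY, price) for price in example_prices]
for price, surplus in zip(example_prices, surpluses):
    print(f"sale price {price}: buyer surplus {surplus}")
```

Under this reading, a sale at exactly the willingness to pay leaves the buyer with zero surplus, and every unit the price drops below it adds one unit to the buyer's payoff.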
