How Far Can LLMs Emulate Human Behavior?: A Strategic Analysis via the Buy-and-Sell Negotiation Game

Mingyu Jeon, Jaeyoung Suh, Suwan Cho, Dohyeon Kim

TL;DR

This work tackles how far LLMs can imitate human behavior in social contexts by using a Buy-and-Sell negotiation game with persona-driven agents. It combines game-theoretic negotiation, asymmetric information, and an XGBoost-based SHAP analysis to quantify how different personas affect outcomes across Buyer and Seller roles. The findings show that while higher traditional benchmark scores generally correlate with better negotiation performance, significant exceptions exist, and aggressive personas tend to yield stronger results; SHAP analyses reveal which personas contribute most to final prices. The study argues that negotiation simulations provide a valuable, complementary metric for evaluating real-world social-interaction capabilities of LLMs, with implications for deploying AI agents in business-like interactive settings.

Abstract

With the rapid advancement of Large Language Models (LLMs), recent studies have drawn attention to their potential not only for simple question-answering tasks but also for more complex conversation and for imitating human-like behavior. In particular, there is considerable interest in how accurately LLMs can reproduce real human emotions and behaviors, and in whether such reproductions can function effectively in real-world scenarios. However, existing benchmarks focus primarily on knowledge-based assessment and therefore do not adequately reflect social interaction or strategic dialogue capabilities. To address these limitations, this work proposes a methodology that uses a Buy-and-Sell negotiation simulation to quantitatively evaluate how well LLMs imitate human emotion and behavior and how well they make strategic decisions. Specifically, we assign different personas to multiple LLMs and conduct negotiations between a Buyer and a Seller, analyzing outcomes such as win rates, transaction prices, and SHAP values. Our experimental results show that models with higher scores on existing benchmarks tend to achieve better negotiation performance overall, although some models perform worse in scenarios that emphasize emotional or social contexts. Moreover, competitive and cunning traits prove more advantageous for negotiation outcomes than altruistic and cooperative traits, suggesting that the assigned persona can lead to significant variations in negotiation strategies and results. Consequently, this study introduces a new evaluation approach for LLMs' social behavior imitation and dialogue strategies, and demonstrates how negotiation simulations can serve as a meaningful complementary metric for real-world interaction capabilities, an aspect often overlooked in existing benchmarks.
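
To make the role of SHAP values in this kind of analysis concrete, the following is a minimal sketch of the underlying idea, not the paper's pipeline. The paper fits an XGBoost model and applies SHAP to it; this sketch instead computes exact Shapley values by brute-force enumeration over a tiny, hand-written price function. The feature names (`seller_competitive`, `seller_cunning`, `buyer_altruistic`), the function `toy_price`, and all numeric effects are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial

# Hypothetical persona indicator features (assumption, not from the paper).
FEATURES = ["seller_competitive", "seller_cunning", "buyer_altruistic"]


def toy_price(active):
    """Hypothetical final-price model: a base price plus persona effects.

    `active` is the set of persona indicators that are switched on.
    All coefficients are made up for this illustration.
    """
    price = 100.0
    if "seller_competitive" in active:
        price += 8.0
    if "seller_cunning" in active:
        price += 5.0
    if "buyer_altruistic" in active:
        price += 3.0
    # Interaction: a competitive seller gains extra against an altruistic buyer.
    if "seller_competitive" in active and "buyer_altruistic" in active:
        price += 4.0
    return price


def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result


phi = shapley_values(FEATURES, toy_price)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

In this toy setup the interaction term is split evenly between the two features involved, and the attributions sum to the difference between the price with all personas active and the base price. Enumeration costs exponential time in the number of features, which is why tree-specific SHAP algorithms are used for real models.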

Paper Structure

This paper contains 20 sections, 3 figures, 7 tables.

Figures (3)

  • Figure 1: Buyer-Seller Negotiation Flow
  • Figure 2: Persona Matching Analysis: Win Rate & Sale Price
  • Figure 3: Mean SHAP Value by Persona