The Power of Personality: A Human Simulation Perspective to Investigate Large Language Model Agents

Yifan Duan, Yihong Tang, Xuefeng Bai, Kehai Chen, Juntao Li, Min Zhang

TL;DR

This study adopts a human psychological simulation framework to systematically examine how Big Five personality traits influence LLM behavior across closed tasks, open-ended tasks, and multi-agent collaboration, using a set of $2^5=32$ trait configurations. By validating trait expression with the BFI-2 scale and evaluating across multiple models (e.g., Qwen-32B, Qwen-14B, Llama-8B) and benchmarks (MMLU, GPQA, TTCT), the authors show that certain traits consistently modulate reasoning accuracy and creativity, while multi-agent configurations yield collective intelligence distinct from single-agent performance. The findings align with some human psychology patterns (e.g., openness enhancing originality) but also reveal model-dependent variability, underscoring the role of model size and architecture. The work suggests personality-driven prompt design as a route to tailoring capabilities and guiding safe, collaborative AI systems in real-world tasks.
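The $2^5=32$ trait configurations mentioned above can be enumerated as a Cartesian product over the five traits. The following is a minimal sketch, assuming each trait takes one of two levels labeled "low" and "high"; the names `TRAITS`, `LEVELS`, `trait_configurations`, and `persona_prompt`, as well as the prompt wording, are hypothetical and are not taken from the paper.

```python
from itertools import product

# The Big Five traits; the two-level encoding below is an assumption
# consistent with the 2^5 = 32 count, not the paper's exact scheme.
TRAITS = [
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
]
LEVELS = ("low", "high")


def trait_configurations():
    """Return every assignment of one level to each of the five traits."""
    return [
        dict(zip(TRAITS, levels))
        for levels in product(LEVELS, repeat=len(TRAITS))
    ]


def persona_prompt(config):
    """Build an illustrative persona instruction (hypothetical wording)."""
    parts = [f"{level} {trait}" for trait, level in config.items()]
    return "You are an agent with " + ", ".join(parts) + "."


configs = trait_configurations()
print(len(configs))  # 32
print(persona_prompt(configs[0]))
```

Each configuration could then be checked against a personality questionnaire such as the BFI-2 before any task evaluation, as the summary describes.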

Abstract

Large language models (LLMs) excel in both closed tasks (including problem-solving and code generation) and open tasks (including creative writing), yet existing explanations for their capabilities lack connections to real-world human intelligence. To fill this gap, this paper systematically investigates LLM intelligence through the lens of "human simulation", addressing three core questions: (1) How do personality traits affect problem-solving in closed tasks? (2) How do traits shape creativity in open tasks? (3) How does single-agent performance influence multi-agent collaboration? By assigning Big Five personality traits to LLM agents and evaluating their performance in single- and multi-agent settings, we reveal that specific traits significantly influence reasoning accuracy (closed tasks) and creative output (open tasks). Furthermore, multi-agent systems exhibit collective intelligence distinct from individual capabilities, driven by distinguishing combinations of personalities.

Paper Structure

This paper contains 36 sections, 1 equation, 3 figures, 22 tables.

Figures (3)

  • Figure 1: Illustration of the overall framework: First, we set prompts based on psychological scales to guide the agent in simulating different Big Five personality traits. Then, we use the BFI-2 scale to test whether the agent can accurately exhibit the designated personality traits. Next, we test agents with different personality traits in both closed and open tasks to explore the impact of simulating different personality traits on agent performance. In addition, we form teams composed of agents with different personality traits and test these teams in both closed and open tasks to study the effect of simulated personality traits on team effectiveness.
  • Figure 2: Correlation Analysis between Personality Traits and Closed Tasks: Pearson Correlation Coefficients and Statistical Significance (p-values) between Five Personality Dimensions and the Accuracy of Five Closed Tasks.
  • Figure 3: Correlation Analysis between Personality Traits and Closed Tasks: Pearson Correlation Coefficients and Statistical Significance (p-values) between Five Personality Dimensions and the Accuracy of Five Closed Tasks.
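The correlation analysis named in the captions pairs each personality dimension with task accuracy via a Pearson coefficient. The following is a minimal sketch of that computation, assuming a trait is encoded as 0 (low) or 1 (high) per agent; the function name `pearson_r` and the toy numbers are illustrative placeholders, not results from the paper, and the p-values reported in the figures are not computed here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy placeholder data (NOT from the paper): one trait's level per agent
# and that agent's accuracy on one closed task.
trait_level = [0, 0, 1, 1]
accuracy = [0.50, 0.54, 0.60, 0.62]
print(round(pearson_r(trait_level, accuracy), 3))
```

Repeating this for each of the five dimensions against each task would give one coefficient per cell of a trait-by-task grid.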