LLMs Simulate Big Five Personality Traits: Further Evidence

Aleksandra Sorokovikova, Natalia Fedorova, Sharwin Rezagholi, Ivan P. Yamshchikov

TL;DR

This study investigates whether LLMs simulate Big Five personality traits in their outputs and how stable those simulations are across prompts and generation settings. Using IPIP-NEO-120 to elicit trait scores, the authors compare GPT-4, Llama2, and Mixtral under two prompt variations and multiple temperature settings. The results reveal model-specific trait profiles and partial sensitivity to prompting and temperature, with implications for user experience design and personalized interaction. The work highlights that while LLMs lack agency, their perceived personalities can influence interaction quality, suggesting avenues for targeted prompt design and cautious deployment.
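A questionnaire-based measurement of this kind can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each model answer has already been parsed into a five-point Likert rating, and the function names (`score_traits`, `trait_spread`) and the two-item key in the usage note are hypothetical stand-ins for the real IPIP-NEO-120 scoring key, which the caller must supply.

```python
from statistics import mean, pstdev

# Assumed rating scale: five-point Likert, 1 (very inaccurate) to 5 (very accurate).
LIKERT_MIN, LIKERT_MAX = 1, 5


def score_traits(responses, key):
    """Average Likert ratings per trait.

    responses: dict mapping item_id -> rating (int in 1..5)
    key:       dict mapping item_id -> (trait_name, reverse_keyed)
               (hypothetical format; supply the real questionnaire key)
    """
    per_trait = {}
    for item_id, rating in responses.items():
        if not LIKERT_MIN <= rating <= LIKERT_MAX:
            raise ValueError(f"rating out of range for item {item_id}: {rating}")
        trait, reverse_keyed = key[item_id]
        # Reverse-keyed items are flipped so that 5 always means "more of the trait".
        value = LIKERT_MIN + LIKERT_MAX - rating if reverse_keyed else rating
        per_trait.setdefault(trait, []).append(value)
    return {trait: mean(values) for trait, values in per_trait.items()}


def trait_spread(runs):
    """Population standard deviation of each trait score across repeated runs.

    runs: list of dicts as returned by score_traits, one per prompt or
          temperature setting. A small spread suggests a stable simulated trait.
    """
    traits = runs[0].keys()
    return {trait: pstdev([run[trait] for run in runs]) for trait in traits}
```

For example, with a toy key `{1: ("E", False), 2: ("E", True)}`, the responses `{1: 4, 2: 2}` both count as a 4 toward extraversion after the second item is reversed. Comparing `trait_spread` across temperature settings or prompt variations gives one simple, assumed way to quantify the stability the study discusses.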

Abstract

We present an empirical investigation into how large language models (LLMs), namely Llama2, GPT4, and Mixtral, simulate the Big Five personality traits. We analyze the personality traits these models simulate and how stable those traits are. This contributes to the broader understanding of the capabilities of LLMs to simulate personality traits and of the implications for personalized human-computer interaction.

Paper Structure

This paper contains 5 sections, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Exemplary Big5 scores (Prompt variation 1, Temperature parameters: 1.5 (GPT4), 0.7 (Llama2), and 0.7 (Mixtral)).