Eliciting Personality Traits in Large Language Models

Airlie Hilliard, Cristian Munoz, Zekun Wu, Adriano Soares Koshiyama

TL;DR

This study investigates whether Large Language Models reveal personality traits when prompted with standard interview questions and trait-activating prompts. It analyzes a broad set of autoregressive models (e.g., GPT, Llama 2, Falcon, Mixtral, XLNet, OPT) and fine-tuned variants by generating thousands of sentence completions per prompt and evaluating them with classifiers trained on the myPersonality dataset. The results indicate that larger models exhibit a broader range of personality signals, especially openness, agreeableness, and emotional stability, while smaller models show more uniform openness and moderate scores across traits. Trait-activation prompts have limited impact, suggesting a mismatch with the social-cue-driven trait expression seen in humans. These findings have implications for transparency and fairness in AI-assisted recruitment and underscore the need for human-subject validation and careful interpretation when inferring personality from AI-generated text.

Abstract

Large Language Models (LLMs) are increasingly being utilized by both candidates and employers in the recruitment context. However, this raises numerous ethical concerns, particularly related to the lack of transparency in these "black-box" models. Although previous studies have sought to increase the transparency of these models by investigating the personality traits of LLMs, many of them have given the models personality assessments to complete. In contrast, this study seeks to obtain a better understanding of such models by examining how their outputs vary with different input prompts. Specifically, we use a novel elicitation approach that measures the models' personality from the language used in their outputs. The prompts are derived from common interview questions, together with prompts designed to elicit particular Big Five personality traits, which let us examine whether the models are susceptible to trait activation as humans are. To do so, we repeatedly prompted multiple LMs with different parameter sizes, including Llama-2, Falcon, Mistral, Bloom, GPT, OPT, and XLNet (base and fine-tuned versions), and examined their personality using classifiers trained on the myPersonality dataset. Our results reveal that, generally, all LLMs demonstrate high openness and low extraversion. However, whereas LMs with fewer parameters exhibit similar behaviour in personality traits, newer LMs and those with more parameters exhibit a broader range of personality traits, with increased agreeableness, emotional stability, and openness. Furthermore, a greater number of parameters is positively associated with openness and conscientiousness. Moreover, fine-tuned models exhibit minor modulations in their personality traits, contingent on the dataset. Implications and directions for future research are discussed.

Paper Structure

This paper contains 29 sections, 1 equation, 12 figures, and 3 tables.

Figures (12)

  • Figure 1: System architecture for deriving personality profiles from large language model responses using text-based classification
  • Figure 2: SHAP visualization illustrating the classifier's rationale for agreeableness, with red indicating positive contribution and blue indicating negative contribution to the "yes" classification.
  • Figure 3: SHAP visualization illustrating the classifier's rationale for conscientiousness, with red indicating positive contribution and blue indicating negative contribution to the "yes" classification.
  • Figure 4: Comparison of Model Size and Personality Trait Scores on the 'Trait Activating Question' Datasets. Orange points represent the base model version, while the blue points represent the chat or instruct version.
  • Figure 5: Average and Standard Deviation of Personality Trait Score and Vocabulary Size for each LLM.
  • ...and 7 more figures