LMLPA: Language Model Linguistic Personality Assessment

Jingyao Zheng, Xian Wang, Simo Hosio, Xiaoxian Xu, Lik-Hang Lee

TL;DR

This paper introduces the Language Model Linguistic Personality Assessment (LMLPA), a system that evaluates the linguistic personalities of LLMs by quantitatively assessing the distinct personality traits reflected in their linguistic outputs.

Abstract

Large Language Models (LLMs) are increasingly used in everyday life and research. One of their most common use cases is conversational interaction, enabled by their language generation capabilities. Just as between two humans, a conversation between an LLM-powered entity and a human depends on the personalities of the conversants. However, measuring the personality of a given LLM remains a challenge. This paper introduces the Language Model Linguistic Personality Assessment (LMLPA), a system designed to evaluate the linguistic personalities of LLMs. Our system helps to understand LLMs' language generation capabilities by quantitatively assessing the distinct personality traits reflected in their linguistic outputs. Unlike traditional human-centric psychometrics, the LMLPA adapts a personality assessment questionnaire, specifically the Big Five Inventory, to align with the operational capabilities of LLMs, and also incorporates findings from the previous literature on language-based personality measurement. To mitigate sensitivity to the order of options, our questionnaire is designed to be open-ended, resulting in textual answers. An AI rater is therefore needed to transform the ambiguous personality information in these text responses into clear numerical indicators of personality traits. Utilising Principal Component Analysis and reliability validations, our findings demonstrate that LLMs possess distinct personality traits that can be effectively quantified by the LMLPA. This research contributes to Human-Computer Interaction and Human-Centered AI, providing a robust framework for future studies to refine AI personality assessments and expand their applications in multiple areas, including education and manufacturing.

Paper Structure

This paper contains 45 sections, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Results Rated by Llama3-8B-Instruct during the Reverse Experiment
  • Figure 3: (a) Heatmap shows inter-item correlation coefficients among three human raters and three AI models for an experiment assessing the AI's responses to an open-ended questionnaire; (b) Bar plot illustrates Intra-class Correlation Coefficients (ICCs) with 95% Confidence Intervals (CI) for three AI models based on single and average measures.
  • Figure 4: Scatter plots illustrate the effect of reversing the rating scale on the consistency of GPT-4-Turbo's responses to 44 questions. Circles on the plots highlight discrepancies between these conditions, indicating inconsistencies. The left plot, using the BFI, shows 16 inconsistencies with a Cohen's Weighted Kappa of 0.401. The right plot, from our rating system, displays fewer inconsistencies (6 total) with a higher Cohen's Weighted Kappa of 0.877, demonstrating strong agreement and enhanced system reliability.
  • Figure 5: GPT-4-Turbo
  • Figure 7: Ridgeline chart of the distribution of personality scores for GPT-4-Turbo, Mistral-7B-Instruct and Llama3-8B-Instruct across the Big Five (BF) dimensions in response to various personality instruction prompts. Each plot shows score distributions from ten different persona descriptions per prompt level, with the x-axis showing the range of observed scores and the y-axis of each sub-figure representing the frequency or density of these scores.
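
The caption of Figure 4 reports Cohen's Weighted Kappa as the measure of agreement between ratings given under the original and the reversed rating scale. The following is a minimal sketch of how such a statistic can be computed for two lists of ordinal ratings. It is not the authors' code: the function name `weighted_kappa`, the 1 to 5 category range, and the choice between linear and quadratic weights are assumptions made for illustration, since the listing above does not state which weighting scheme was used.

```python
def weighted_kappa(ratings_a, ratings_b, categories=(1, 2, 3, 4, 5), weights="linear"):
    """Cohen's weighted kappa for two equally long lists of ordinal ratings.

    `categories` and `weights` are illustrative assumptions, not values
    taken from the paper.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions of (rating_a, rating_b) pairs.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each set of ratings.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # Disagreement weight: 0 on the diagonal, 1 at maximum distance.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed_dis = sum(disagreement(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    expected_dis = sum(disagreement(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))

    if expected_dis == 0:
        # Both lists use one identical category; kappa is undefined.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return 1.0 - observed_dis / expected_dis


if __name__ == "__main__":
    # Hypothetical ratings, only to show the call shape.
    original = [1, 2, 3, 4, 5, 3]
    reversed_scale = [1, 2, 3, 4, 5, 4]
    print(weighted_kappa(original, reversed_scale))
```

Values near 1 indicate that the two conditions yield nearly the same ratings, while values near 0 indicate agreement no better than chance under the chosen weighting.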