Exploring a Gamified Personality Assessment Method through Interaction with LLM Agents Embodying Different Personalities

Baiqiao Zhang, Xiangxian Li, Chao Zhou, Xinyu Gai, Juan Liu, Xue Yang, Nianlong Li, Shuai Ma, Xiaojuan Ma, Yong-jin Liu, Yulong Bian

Abstract

Low-intrusion, automated personality assessment is receiving increasing attention in psychology and human-computer interaction. This study explores an interactive approach to personality assessment, focusing on the multiplicity of personality representation. We propose a framework for Gamified Personality Assessment through Multi-Personality Representations (Multi-PR GPA). The framework uses Large Language Models to give virtual agents different personalities. These agents elicit multifaceted human personality representations by engaging users in interactive games. Drawing on the multi-type textual data generated throughout the interaction, the framework produces personality assessments with interpretable insights. Grounded in the classic Big Five personality theory, we developed a prototype system and conducted a user study to evaluate the efficacy of Multi-PR GPA. The results affirm the effectiveness of our approach to personality assessment and show that it performs better when the multiplicity of personality representation is taken into account. Error structure analysis further revealed systematic assessment biases in LLMs, which multi-context aggregation partially mitigated.

Paper Structure

This paper contains 51 sections, 7 equations, 4 figures, 10 tables.

Figures (4)

  • Figure 1: The framework includes Gamified Interaction (orange block, section \ref{GI}), LLM Agent with Controlled Personality (pink block, section \ref{LAMP}), Multi-type Game Data Perception (blue block, section \ref{MTTP}), and Personality Assessment (green block, section \ref{PA}). This framework assesses personality based on the multi-personality representations during interactions.
  • Figure 2: The prototype system used in the experiment.
  • Figure 3: Overview of the experimental procedure.
  • Figure 4: Four-panel decomposition of prediction error across six LLMs, five OCEAN personality traits, and six experimental conditions. Each cell displays two values: the top bold number is the panel's primary metric (determining cell color), and the bottom number is either MAE (panels a, b, d) or the overestimation ratio (panel c). A black diamond ($\blacklozenge$) marks the condition with the lowest $|\text{weighted SME}|$ for a given model--trait pair, and a red star ($\bigstar$) marks the condition with the lowest MAE. (a) Overestimation Contribution. Weighted overestimation per participant: $(n^{+}/N) \times \text{SME}^{+}$, where $n^{+}$ is the number of overestimated participants and $N{=}42$ is the total sample size. Larger values (darker red) indicate that more participants were overestimated by larger margins. (b) Underestimation Contribution. Weighted underestimation per participant: $(n^{-}/N) \times \text{SME}^{-}$, where $n^{-}$ is the number of underestimated participants. More negative values (darker blue) indicate that more participants were underestimated by larger margins. (c) Total Absolute Error (MAE). The sum of $|\text{overestimation}|$ and $|\text{underestimation}|$ contributions, equivalent to MAE. A yellow border indicates that overestimation accounts for more than 50% of the total error. The bottom value shows the percentage of error attributable to overestimation. (d) Net Signed Error (SME). The sum of panels (a) and (b) after cancellation.
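
The four quantities in the Figure 4 caption can be illustrated with a minimal sketch for a single model, trait, and condition. It is not the authors' analysis code. It assumes that each participant's error is the predicted score minus the self-reported score, and that $\text{SME}^{+}$ and $\text{SME}^{-}$ are the mean signed errors among overestimated and underestimated participants respectively. The function name `decompose_error` and the dictionary keys are hypothetical.

```python
from typing import Dict, Sequence


def decompose_error(predicted: Sequence[float], actual: Sequence[float]) -> Dict[str, float]:
    """Split prediction error into over- and underestimation contributions.

    Assumption (not stated in the source): error = predicted - actual.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("predicted and actual must be non-empty and of equal length")

    errors = [p - a for p, a in zip(predicted, actual)]
    n_total = len(errors)                      # N in the caption
    over = [e for e in errors if e > 0]        # overestimated participants (n+)
    under = [e for e in errors if e < 0]       # underestimated participants (n-)

    # Mean signed error within each group; 0.0 when a group is empty.
    sme_pos = sum(over) / len(over) if over else 0.0
    sme_neg = sum(under) / len(under) if under else 0.0

    # Panel (a): (n+ / N) * SME+, always >= 0.
    over_contribution = (len(over) / n_total) * sme_pos
    # Panel (b): (n- / N) * SME-, always <= 0.
    under_contribution = (len(under) / n_total) * sme_neg

    # Panel (c): sum of absolute contributions, which equals the MAE.
    mae = abs(over_contribution) + abs(under_contribution)
    # Panel (d): signed sum, where opposite-sign errors cancel.
    sme = over_contribution + under_contribution
    # Share of total error due to overestimation (bottom value in panel c).
    over_share = over_contribution / mae if mae > 0 else 0.0

    return {
        "over_contribution": over_contribution,
        "under_contribution": under_contribution,
        "mae": mae,
        "sme": sme,
        "over_share": over_share,
    }


if __name__ == "__main__":
    # Made-up scores for four participants, for illustration only.
    result = decompose_error([4.0, 3.0, 2.0, 5.0], [3.0, 3.5, 2.0, 3.0])
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

Participants with zero error fall into neither group but still count toward $N$, which is why the two contributions sum in absolute value to the ordinary MAE.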