Predicting the Big Five Personality Traits in Chinese Counselling Dialogues Using Large Language Models

Yang Yan; Lizhi Ma; Anqi Li; Jingsong Ma; Zhenzhong Lan

TL;DR

This study examines whether Large Language Models can predict the Big Five personality traits directly from counseling dialogues and introduces an innovative framework for the task. It finds a significant correlation between LLM-predicted and actual Big Five traits, supporting the validity of the framework.

Abstract

Accurate assessment of personality traits is crucial for effective psycho-counseling, yet traditional methods such as self-report questionnaires are time-consuming and prone to bias. This study examines whether Large Language Models (LLMs) can predict the Big Five personality traits directly from counseling dialogues and introduces an innovative framework for the task. The framework applies role-play and questionnaire-based prompting to condition LLMs on counseling sessions, simulating client responses to the Big Five Inventory. We evaluated the framework on 853 real-world counseling sessions and found a significant correlation between LLM-predicted and actual Big Five traits, supporting the validity of the framework. Ablation studies further highlight the importance of role-play simulation and of task simplification via questionnaires for prediction accuracy. Our fine-tuned Llama3-8B model, trained with Direct Preference Optimization combined with Supervised Fine-Tuning, achieves a 130.95% improvement and surpasses the state-of-the-art Qwen1.5-110B by 36.94% in personality prediction validity. In conclusion, LLMs can predict personality from counseling dialogues. Our code and model are publicly available at https://github.com/kuri-leo/BigFive-LLM-Predictor, providing a valuable tool for future research in computational psychometrics.
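
The pipeline described in the abstract (role-play prompting, questionnaire completion, correlation against self-reported scores) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the four items are a small illustrative subset in the style of the Big Five Inventory, and the prompt wording, the helper names `build_prompt`, `score_traits`, and `pearson`, and the 1-5 averaging scheme are all hypothetical. The actual prompts, items, and scoring used in the paper may differ.

```python
from math import sqrt

# Illustrative subset of inventory-style items: (text, trait, reverse_keyed).
# Assumption: chosen for demonstration only, not the paper's item set.
ITEMS = [
    ("is talkative", "E", False),
    ("is reserved", "E", True),
    ("worries a lot", "N", False),
    ("is relaxed, handles stress well", "N", True),
]


def build_prompt(dialogue: str, item_text: str) -> str:
    """Build a role-play prompt asking the model to rate one item as the client.

    The wording is hypothetical; it only shows the general shape of
    conditioning on a dialogue and then posing a questionnaire item.
    """
    return (
        "Below is a counseling dialogue.\n"
        f"{dialogue}\n\n"
        "Act as the client in this dialogue. On a scale from 1 (disagree "
        "strongly) to 5 (agree strongly), rate the statement:\n"
        f"I see myself as someone who {item_text}.\n"
        "Answer with a single number."
    )


def score_traits(answers: list[int]) -> dict[str, float]:
    """Average the 1-5 ratings per trait, flipping reverse-keyed items."""
    per_trait: dict[str, list[int]] = {}
    for (_, trait, reverse), rating in zip(ITEMS, answers):
        per_trait.setdefault(trait, []).append(6 - rating if reverse else rating)
    return {trait: sum(vals) / len(vals) for trait, vals in per_trait.items()}


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

In such a setup, the model would be queried once per item with `build_prompt`, the parsed ratings passed to `score_traits`, and the resulting per-trait scores compared across clients against self-reported scores with `pearson`, one coefficient per trait.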

Paper Structure (64 sections, 1 equation, 11 figures, 12 tables)

Introduction
Related Work
Automatic Personality Assessment
Prompting Strategies
Alignment Strategies
Framework for Predicting OCEAN traits
Prompting Strategy Design
LLM Conditioning for OCEAN trait Prediction
Evaluation Metrics
Validity
Reliability
Experiments
Data Collection and Preprocessing
RQ1: Can LLMs predict OCEAN traits from counseling dialogues?
Role-play and Questionnaires Impact
...and 49 more sections

Figures (11)

Figure 1: Example of our framework for predicting OCEAN traits from counseling dialogues. The framework comprises three integral steps: conditioning the LLM on the counseling dialogue, prompting the LLM with role-play and questionnaire instructions, and letting the LLM complete the questionnaire on behalf of the client to obtain the predicted OCEAN traits.
Figure 2: PCC Changes Across Different Dialogue Session Granularities. The plots illustrate that the PCC increases rapidly up to 30% of the dialogue context, beyond which the increase is slower. This observation, corroborated by the session-granularity table (tab:granularity), which shows significant PCC at 30% session granularity, indicates that 30% of the dialogue context suffices for predicting OCEAN traits.
Figure 3: PCC Changes Across Different Model Sizes. The plots demonstrate a positive correlation between model size and average PCC in the "Qwen1.5" series. However, statistical significance is observed only for the Qwen1.5-110B-Chat and Qwen1.5-72B-Chat models. These findings indicate that effective zero-shot personality prediction demands large, highly capable models and significant computational resources.
Figure 4: Boxplot of MAE for Dimensions of OCEAN. The red line represents a significant error threshold at $error=1$. Both the median and upper quartile fall below this threshold, demonstrating our framework's strong performance in predicting OCEAN traits. Additionally, our fine-tuned Llama-3-8b-BFI exhibits fewer long-tail errors and outliers compared to Qwen1.5-110B-Chat, highlighting the validity of our model and fine-tuning strategy.
Figure 5: Rewards for "chosen" and "rejected" w/ and w/o SFT during DPO fine-tuning. The baseline involves DPO fine-tuning without SFT, while our alignment strategy incorporates SFT during DPO fine-tuning. Results indicate that with SFT, both rewards consistently decrease, whereas without SFT, the rewards increase and remain stable. The "rejected" reward exhibits more significant changes than the "chosen" reward, aligning with previous studies (feng2024analyzing; xu2024dpo; pang2024iterative).
...and 6 more figures
