PsyPlay: Personality-Infused Role-Playing Conversational Agents

Tao Yang, Yuhua Zhu, Xiaojun Quan, Cong Liu, Qifan Wang

TL;DR

PsyPlay addresses a gap in Role-Playing Conversational Agents (RPCAs) by enabling personality-infused role-playing with Big Five traits through a three-stage framework: Role Card Creation, Topic Extraction, and Dialogue Generation. The approach combines trait-driven role cards, topic grounding from real-world data, and AutoGen-based dialogue to produce multi-turn interactions in which agents consistently reflect their designated personalities. Automated back-testing with GPT-3.5 shows an overall success rate of 80.31%, with higher fidelity for positive traits. The authors also build PsyPlay-Bench, a corpus of 4745 validated dialogues, to support future research in personalized dialogue and personality detection. The findings further show how trait levels, dialogue turns, and model alignment influence portrayal fidelity, offering practical insights for building more realistic and controllable personality-aware agents. Together, the PsyPlay framework and PsyPlay-Bench provide resources for developing, evaluating, and benchmarking personality-infused RPCAs.

Abstract

The current research on Role-Playing Conversational Agents (RPCAs) with Large Language Models (LLMs) primarily focuses on imitating specific speaking styles and utilizing character backgrounds, neglecting the depiction of deeper personality traits. In this study, we introduce personality-infused role-playing for LLM agents, which encourages agents to accurately portray their designated personality traits during dialogues. We then propose PsyPlay, a dialogue generation framework that facilitates the expression of rich personalities among multiple LLM agents. Specifically, PsyPlay enables agents to assume roles with distinct personality traits and engage in discussions centered around specific topics, consistently exhibiting their designated personality traits throughout the interactions. Validation on generated dialogue data demonstrates that PsyPlay can accurately portray the intended personality traits, achieving an overall success rate of 80.31% on GPT-3.5. Notably, we observe that LLMs aligned with positive values are more successful in portraying positive personality roles compared to negative ones. Moreover, we construct a dialogue corpus for personality-infused role-playing, called PsyPlay-Bench. The corpus, which consists of 4745 instances of correctly portrayed dialogues using PsyPlay, aims to further facilitate research in personalized role-playing and dialogue personality detection.
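
As a rough illustration of how a back-tested success rate of this kind could be computed, the sketch below compares each role's designated trait polarity with the polarity a judge model detects from the dialogue. This is not the authors' implementation: the `RoleCard` fields, the `success_rate` helper, the "high"/"low" labels, and the toy data are all assumptions made for illustration, and the exact definition of success used in the paper may differ.

```python
from dataclasses import dataclass

# The Big Five trait names; the role-card layout below is hypothetical.
BIG_FIVE = ("openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism")


@dataclass(frozen=True)
class RoleCard:
    """Hypothetical role card: a name plus one designated trait and polarity."""
    name: str
    trait: str     # one of BIG_FIVE
    polarity: str  # "high" or "low" (assumed labels)


def success_rate(cards, detected):
    """Percentage of roles whose detected polarity matches the designated one.

    `detected` maps a role name to the polarity reported by a back-test
    (for example, a judge model reading the dialogue). Roles missing from
    `detected` count as failures.
    """
    if not cards:
        return 0.0
    hits = sum(1 for card in cards if detected.get(card.name) == card.polarity)
    return 100.0 * hits / len(cards)


# Toy data, invented for this example only (not results from the paper).
cards = [
    RoleCard("A", "agreeableness", "high"),
    RoleCard("B", "openness", "low"),
    RoleCard("C", "extraversion", "high"),
    RoleCard("D", "neuroticism", "low"),
]
detected = {"A": "high", "B": "low", "C": "high", "D": "high"}

rate = success_rate(cards, detected)
print(f"success rate: {rate:.2f}%")
```

In this toy run three of the four roles are judged to match their designated polarity. A real back-test would replace the hand-written `detected` mapping with the output of a judge model applied to each generated dialogue.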

Paper Structure

This paper contains 36 sections, 5 figures, 11 tables.

Figures (5)

  • Figure 1: An illustration of a dialogue that encapsulates the distinctive personalities of two agents. Agent A, who exhibits high levels of conscientiousness and agreeableness, typically maintains an optimistic perspective and a strong inclination towards empathy. Conversely, Agent B, who is characterized by low levels of openness and agreeableness, tends to display a pessimistic attitude and a resistance towards embracing new experiences.
  • Figure 2: Illustration of the proposed PsyPlay through three stages: Role Card Creation, Topic Extraction, and Dialogue Generation. The first stage aims to create multiple personalized roles. The second stage extracts appropriate dialogue topics for roles. The third stage prompts the roles to engage in conversation with each other based on the given topic, resulting in personality-infused dialogues.
  • Figure 3: Results of the study on personality levels. The lowest level, "a bit", yields a poor success rate, while the higher levels, "very" and "extremely", yield better rates.
  • Figure 4: Results of the study on dialogue turns. The results suggest that the positive and negative dimensions show opposite trends.
  • Figure 5: Diversity analysis of portrayed traits. Rows and columns represent predefined and back-tested personalities, respectively. Each score is the percentage of cases in which that personality is detected from the dialogue.