Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions?

Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao

TL;DR

This work benchmarks persona-driven decision-making in Role-Playing Language Agents using LifeChoice, a dataset of 1,462 decisions from 388 novels annotated by literary experts. It shows that current LLM-based RPLAs can approximate character-driven choices but lag human performance, and introduces CharMap, a persona-based memory retrieval method that improves accuracy by 5.03% over baseline profiles. The study highlights the importance of comprehensive character descriptions and reliable memory retrieval for faithful decision simulation, and discusses data leakage, long-context challenges, and genre/temporal effects. Collectively, LifeChoice and CharMap offer a rigorous framework for evaluating and advancing RPLAs in high-fidelity persona reasoning with implications for personal LLM assistants. Limitations include reliance on fiction and potential biases in expert analyses, pointing to avenues for safer, real-world extensions.

Abstract

Can Large Language Models (LLMs) simulate humans in making important decisions? Recent research has unveiled the potential of using LLMs to develop role-playing language agents (RPLAs), which mainly mimic the knowledge and tone of various characters. However, imitative decision-making necessitates a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions given the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset, LIFECHOICE, comprising 1,462 character decision points from 388 books. Then, we conduct comprehensive experiments on LIFECHOICE with various LLMs and RPLA methodologies. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains. Hence, we further propose the CHARMAP method, which adopts persona-based memory retrieval and significantly advances RPLAs on this task, achieving a 5.03% increase in accuracy.

Paper Structure (45 sections, 6 figures, 13 tables)

Figures (6)

  • Figure 2: Statistics of motivation types in LifeChoice, with the first words for each motivation type.
  • Figure 3: An overview of CharMap, a two-step scenario-specific character profile building approach.
  • Figure 4: The impact of the number of book reviews on accuracy in LifeChoice, with new books being those not present in the training corpus of LLMs.
  • Figure 5: Heatmap of the impact of motivation types on accuracy. Predictions are made with four profile sources: incremental updating, embedding-retrieved memory, direct concatenation of both, and CharMap. The role-playing model is GPT-4.
  • Figure 6: The impact of novel genre on accuracy.
  • ...and 1 more figure