Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions?
Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao
TL;DR
This work benchmarks persona-driven decision-making in Role-Playing Language Agents using LifeChoice, a dataset of 1,462 character decisions from 388 novels, built from character analyses written by literary experts. It shows that current LLM-based RPLAs can approximate character-driven choices but lag behind human performance, and introduces CharMap, a persona-based memory retrieval method that improves accuracy by 5.03% over baseline profiles. The study highlights the importance of comprehensive character descriptions and reliable memory retrieval for faithful decision simulation, and discusses data leakage, long-context challenges, and genre and temporal effects. Together, LifeChoice and CharMap offer a framework for evaluating and advancing RPLAs in high-fidelity persona reasoning, with implications for personal LLM assistants. Limitations include reliance on fiction and potential biases in expert analyses, pointing to avenues for safer, real-world extensions.
Abstract
Can Large Language Models (LLMs) simulate humans in making important decisions? Recent research has unveiled the potential of using LLMs to develop role-playing language agents (RPLAs), which mainly mimic the knowledge and tones of various characters. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions given the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset, LIFECHOICE, comprising 1,462 characters' decision points from 388 books. We then conduct comprehensive experiments on LIFECHOICE with various LLMs and RPLA methodologies. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains. Hence, we further propose the CHARMAP method, which adopts persona-based memory retrieval and significantly advances RPLAs on this task, achieving a 5.03% increase in accuracy.
