Human Simulacra: Benchmarking the Personification of Large Language Models

Qiujie Xie; Qiming Feng; Tianqi Zhang; Qingqiu Li; Linyi Yang; Yuejie Zhang; Rui Feng; Liang He; Shang Gao; Yue Zhang

TL;DR

The paper introduces Human Simulacra, a psychology-grounded benchmark for personifying large language models so that they can simulate human participants in experiments. It builds a high-quality dataset of virtual characters with life stories, with personalities grounded in Jung's eight-dimensional framework, and proposes a Multi-Agent Cognitive Mechanism (MACM) to emulate memory and cognition. Through psychology-guided evaluation (self reports and observer reports) and conformity experiments, the study shows that advanced LLMs, especially with MACM, can approach human-like responses on self-report assessments, though realism as judged by external observers remains challenging. The work provides a reproducible platform, including data and code, for exploring when and how LLM-based simulacra can substitute for human subjects, while emphasizing ethical considerations and ongoing limitations.

Abstract

Large language models (LLMs) are recognized as systems that closely mimic aspects of human intelligence. This capability has attracted attention from the social science community, who see the potential in leveraging LLMs to replace human participants in experiments, thereby reducing research costs and complexity. In this paper, we introduce a framework for large language model personification, including a strategy for constructing virtual characters' life stories from the ground up, a Multi-Agent Cognitive Mechanism capable of simulating human cognitive processes, and a psychology-guided evaluation method to assess human simulations from both self and observational perspectives. Experimental results demonstrate that our constructed simulacra can produce personified responses that align with their target characters. Our work is a preliminary exploration that offers great potential in practical applications. In the hope of inspiring further investigations, our code and dataset are available at: https://github.com/hasakiXie123/Human-Simulacra.

Paper Structure (37 sections, 1 equation, 12 figures, 29 tables)

Introduction
Related Work
Human Simulacra Dataset
Character Attributes
Personality Modeling
Character Profile and Life Story Generation
Psychology-guided Evaluation
Self Report
Observer Report
Multi-Agent Cognitive Mechanism
Experiments
Psychology-guided Evaluation Results
Psychological Experiment Replication
Discussion
Conclusion
...and 22 more sections

Figures (12)

Figure 1: Overview of the proposed benchmark.
Figure 2: Process of constructing life stories for characters. At each step, humans are involved in thoroughly reviewing the generated content, ensuring it is free from biases and harmful information.
Figure 3: Human Simulacra dataset. (a) Profiles of virtual characters. (b) Personalities of characters, displayed in a radar chart based on Jung's eight-dimensional theory. Each line represents one character; labels such as Te and Si abbreviate the personality dimensions. (c) Word count of life stories for each virtual character.
Figure 4: Psychology-guided evaluation. Self reports assess simulacra's self-awareness through character-specific questions based on their life stories. Observer reports evaluate simulacra's realism through scenario-based assessments analyzed by human judges.
Figure 5: Multi-Agent Cognitive Mechanism. It involves four LLM-driven agents: the Thinking Agent and the Emotion Agent handle logical and emotional analysis, respectively, as well as memory construction; the Memory Agent manages retrieval of memories; and the Top Agent coordinates all activities. Upon receiving a stimulus, these agents collaborate to generate appropriate responses, simulating complex human cognitive processes.
...and 7 more figures
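
The agent roles named in the Figure 5 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the class layout, the prompt strings, the word-overlap retrieval rule, and the `echo_llm` stand-in for a real model call are all assumptions made so that the example runs offline.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in type for an LLM call: prompt in, text out.
LLM = Callable[[str], str]


def echo_llm(prompt: str) -> str:
    """Toy model so the sketch runs without network access; it wraps the prompt."""
    return f"<{prompt}>"


def _words(text: str) -> set:
    """Lowercased word set, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


@dataclass
class MemoryAgent:
    """Holds memories and retrieves those sharing a word with the stimulus (assumed rule)."""
    memories: List[str] = field(default_factory=list)

    def store(self, item: str) -> None:
        self.memories.append(item)

    def retrieve(self, stimulus: str) -> List[str]:
        query = _words(stimulus)
        return [m for m in self.memories if query & _words(m)]


@dataclass
class AnalysisAgent:
    """Common shape for the Thinking ('logical') and Emotion ('emotional') agents."""
    llm: LLM
    mode: str

    def analyze(self, stimulus: str, recalled: List[str]) -> str:
        context = "; ".join(recalled) or "none"
        return self.llm(f"{self.mode} analysis of '{stimulus}' given memories: {context}")


@dataclass
class TopAgent:
    """Coordinates retrieval, the two analyses, and the final response."""
    llm: LLM
    memory: MemoryAgent
    thinking: AnalysisAgent
    emotion: AnalysisAgent

    def respond(self, stimulus: str) -> str:
        recalled = self.memory.retrieve(stimulus)
        thought = self.thinking.analyze(stimulus, recalled)
        feeling = self.emotion.analyze(stimulus, recalled)
        reply = self.llm(f"reply to '{stimulus}' using {thought} and {feeling}")
        # Simplified memory construction: keep both analyses as new memories.
        self.memory.store(thought)
        self.memory.store(feeling)
        return reply
```

Constructing a `TopAgent` with one `MemoryAgent` and two `AnalysisAgent` instances, then calling `respond` with a stimulus, runs one retrieve, analyze, and reply cycle. Replacing `echo_llm` with a real model client is the only change needed to try the same flow against an actual LLM.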
