Table of Contents
Fetching ...

Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?

Yue Huang, Zhengqing Yuan, Yujun Zhou, Kehan Guo, Xiangqi Wang, Haomin Zhuang, Weixiang Sun, Lichao Sun, Jindong Wang, Yanfang Ye, Xiangliang Zhang

TL;DR

This paper interrogates the reliability of LLM-based social simulations by introducing the TrustSim dataset, which spans 10 CSS domains with 740 carefully crafted, persona-driven evaluation instances. It shows that while many LLMs perform well in role-based tasks, inconsistencies and misalignments persist, and general performance does not reliably predict simulation reliability. To address this, the authors propose AdaORPO, an adaptive learning rate-based ORPO algorithm that jointly fine-tunes and aligns outputs using judge-based feedback, demonstrating improved reliability across multiple open-weight models. The work provides a foundation for more robust and trustworthy LLM-driven social simulations and highlights important directions for future research and ethical considerations.

Abstract

Large Language Models (LLMs) are increasingly employed for simulations, enabling applications in role-playing agents and Computational Social Science (CSS). However, the reliability of these simulations is under-explored, which raises concerns about the trustworthiness of LLMs in these applications. In this paper, we aim to answer ``How reliable is LLM-based simulation?'' To address this, we introduce TrustSim, an evaluation dataset covering 10 CSS-related topics, to systematically investigate the reliability of the LLM simulation. We conducted experiments on 14 LLMs and found that inconsistencies persist in the LLM-based simulated roles. In addition, the consistency level of LLMs does not strongly correlate with their general performance. To enhance the reliability of LLMs in simulation, we proposed Adaptive Learning Rate Based ORPO (AdaORPO), a reinforcement learning-based algorithm to improve the reliability in simulation across 7 LLMs. Our research provides a foundation for future studies to explore more robust and trustworthy LLM-based simulations.

Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?

TL;DR

This paper interrogates the reliability of LLM-based social simulations by introducing the TrustSim dataset, which spans 10 CSS domains with 740 carefully crafted, persona-driven evaluation instances. It shows that while many LLMs perform well in role-based tasks, inconsistencies and misalignments persist, and general performance does not reliably predict simulation reliability. To address this, the authors propose AdaORPO, an adaptive learning rate-based ORPO algorithm that jointly fine-tunes and aligns outputs using judge-based feedback, demonstrating improved reliability across multiple open-weight models. The work provides a foundation for more robust and trustworthy LLM-driven social simulations and highlights important directions for future research and ethical considerations.

Abstract

Large Language Models (LLMs) are increasingly employed for simulations, enabling applications in role-playing agents and Computational Social Science (CSS). However, the reliability of these simulations is under-explored, which raises concerns about the trustworthiness of LLMs in these applications. In this paper, we aim to answer ``How reliable is LLM-based simulation?'' To address this, we introduce TrustSim, an evaluation dataset covering 10 CSS-related topics, to systematically investigate the reliability of the LLM simulation. We conducted experiments on 14 LLMs and found that inconsistencies persist in the LLM-based simulated roles. In addition, the consistency level of LLMs does not strongly correlate with their general performance. To enhance the reliability of LLMs in simulation, we proposed Adaptive Learning Rate Based ORPO (AdaORPO), a reinforcement learning-based algorithm to improve the reliability in simulation across 7 LLMs. Our research provides a foundation for future studies to explore more robust and trustworthy LLM-based simulations.

Paper Structure

This paper contains 18 sections, 6 equations, 17 figures, 6 tables, 1 algorithm.

Figures (17)

  • Figure 1: An example of cognitive inconsistency in simulation: expected fifth-grade response vs. unexpected advanced calculus solution.
  • Figure 2: A data example in TrustSim. Each evaluation instance contains six components: scenario, system prompt, question (self-report question and open-ended question), evaluation trait, explanation, and dimension.
  • Figure 3: The distribution of evaluation instances across different subjects (left) and the distribution of the number of words in different kinds of questions (right). SR: Self-Report, OE: Open-Ended.
  • Figure 4: The pipeline of dataset construction.
  • Figure 5: An example of an unrelated self-report question and open-ended question.
  • ...and 12 more figures