SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users

Xinnong Zhang, Jiayu Lin, Xinyi Mou, Shiyue Yang, Xiawei Liu, Libo Sun, Hanjia Lyu, Yihang Yang, Weihong Qi, Yue Chen, Guanying Li, Ling Yan, Yao Hu, Siming Chen, Yu Wang, Xuanjing Huang, Jiebo Luo, Shiping Tang, Libo Wu, Baohua Zhou, Zhongyu Wei

TL;DR

SocioVerse presents a modular, LLM-agent-driven world model for large-scale social simulation, anchored by a pool of 10 million users and four alignment modules that align the environment, users, scenarios, and behaviors. The framework is validated across three domains (politics, news, and economics), showing that it can reproduce population dynamics with diversity and credibility while revealing model-specific biases and domain-dependent performance. Ablation studies indicate that incorporating prior real-world distributions and historical user content improves accuracy, underscoring the value of rich demographic data and context. By standardizing pipelines and enabling scalable, representative simulations, SocioVerse offers a practical platform for social science researchers to explore policy impacts, public opinion, and economic behaviors at scale.

Abstract

Social simulation is transforming traditional social science research by modeling human behavior through interactions between virtual individuals and their environments. With recent advances in large language models (LLMs), this approach has shown growing potential in capturing individual differences and predicting group behaviors. However, existing methods face alignment challenges related to the environment, target users, interaction mechanisms, and behavioral patterns. To this end, we introduce SocioVerse, an LLM-agent-driven world model for social simulation. Our framework features four powerful alignment components and a user pool of 10 million real individuals. To validate its effectiveness, we conducted large-scale simulation experiments across three distinct domains: politics, news, and economics. Results demonstrate that SocioVerse can reflect large-scale population dynamics while ensuring diversity, credibility, and representativeness through standardized procedures and minimal manual adjustments.

Paper Structure

This paper contains 54 sections, 6 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: An illustration of SocioVerse applied to the Ukraine issue. The alignment challenges are addressed with respect to environment, user, scenario, and behavior.
  • Figure 2: An illustration of the SocioVerse framework involving four powerful parts. The social environment provides an updated context for the simulation. During the simulation, the behavior engine takes the simulation setting, user profiles, and social information from the scenario engine, user engine, and social environment, respectively, and generates the results according to the query.
  • Figure 3: Illustration of three scenarios representing (a) presidential election prediction, (b) breaking news feedback, and (c) national economic survey.
  • Figure 4: An illustration of the performance of the breaking news feedback simulation, where PC, PR, PB, TR, FA, and PA denote six dimensions from the Likert scale (see the questionnaire design in the news subsection), with 1 point standing for totally disagree and 5 points for totally agree.
  • Figure 5: Demographic distribution of the X and Rednote user pools.