AIvilization v0: Toward Large-Scale Artificial Social Simulation with a Unified Agent Architecture and Adaptive Agent Profiles
Wenkai Fan, Shurui Zhang, Xiaolong Wang, Haowei Yang, Tsz Wai Chan, Xingyan Chen, Junquan Bi, Zirui Zhou, Jia Liu, Kani Chen
TL;DR
AIvilization v0 tackles the challenge of sustaining teleologically coherent yet adaptively correct behavior for large-scale LLM-driven agents in a resource-constrained artificial society. It introduces a unified cognitive core featuring a Branch-Thinking Planner, pre-execution Action Simulator, adaptive Dual-Process Memory, and human-in-the-loop steering to manage long-horizon objectives under dynamic constraints. The platform couples physiological survival, a Leontief-like production network with non-substitutable inputs, and an AMM-based price mechanism to generate emergent macro phenomena such as inflation signals and wealth stratification, which align with canonical stylized facts. Ablation studies reveal that hierarchical branching and objective decomposition improve robustness in complex multi-objective tasks, while lighter planning suffices for simple tasks, underscoring the system’s flexibility and potential as a research-grade testbed for emergent social dynamics and hybrid governance.
Abstract
AIvilization v0 is a publicly deployed large-scale artificial society that couples a resource-constrained sandbox economy with a unified LLM-agent architecture, aiming to sustain long-horizon autonomy while remaining executable in a rapidly changing environment. To mitigate the tension between goal stability and reactive correctness, we introduce (i) a hierarchical branch-thinking planner that decomposes life goals into parallel objective branches and uses simulation-guided validation plus tiered re-planning to ensure feasibility; (ii) an adaptive agent profile with dual-process memory that separates short-term execution traces from long-term semantic consolidation, enabling persistent yet evolving identity; and (iii) a human-in-the-loop steering interface that injects long-horizon objectives and short commands at appropriate abstraction levels, with effects propagated through memory rather than brittle prompt overrides. The environment integrates physiological survival costs, non-substitutable multi-tier production, an AMM-based price mechanism, and a gated education-occupation system. Using high-frequency transactions from the platform's mature phase, we find stable markets that reproduce key stylized facts (heavy-tailed returns and volatility clustering) and produce structured wealth stratification driven by education and access constraints. Ablations show simplified planners can match performance on narrow tasks, while the full architecture is more robust under multi-objective, long-horizon settings, supporting delayed investment and sustained exploration.
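To make the two economic ingredients named above concrete, the following is a minimal illustrative sketch, not the platform's implementation. It assumes a textbook Leontief production function (output limited by the scarcest required input) and a constant-product AMM rule (goods × coins = k); the abstract says only "AMM-based" and does not specify the pricing curve. The names `leontief_output` and `ConstantProductPool`, and all quantities, are hypothetical.

```python
def leontief_output(inputs, requirements):
    """Units producible when every input is needed in a fixed proportion.

    Inputs cannot substitute for one another, so output is capped by the
    scarcest input relative to its per-unit requirement.
    """
    return min(inputs.get(good, 0) / need for good, need in requirements.items())


class ConstantProductPool:
    """Toy AMM pool for one good, ASSUMING the constant-product rule."""

    def __init__(self, goods, coins):
        self.goods = goods  # units of the good held by the pool
        self.coins = coins  # currency held by the pool

    def price(self):
        # Marginal price implied by the current reserves.
        return self.coins / self.goods

    def buy(self, quantity):
        """Remove `quantity` goods from the pool; return the coin cost."""
        if not 0 < quantity < self.goods:
            raise ValueError("quantity must be positive and below pool reserves")
        k = self.goods * self.coins
        new_goods = self.goods - quantity
        cost = k / new_goods - self.coins  # keeps goods * coins == k
        self.goods = new_goods
        self.coins += cost
        return cost


# Hypothetical recipe: one unit of bread needs 2 wheat and 1 water.
bread = leontief_output({"wheat": 10, "water": 4}, {"wheat": 2, "water": 1})

pool = ConstantProductPool(goods=100, coins=1000)
price_before = pool.price()
cost = pool.buy(20)
price_after = pool.price()
```

In this sketch, extra wheat beyond 8 units adds nothing to output because water is the binding input, and buying from the pool raises the quoted price because reserves shrink. Sustained demand against a limited supply therefore shows up as rising prices, which is one simple way an inflation signal could emerge under these assumptions.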
