Project Sid: Many-agent simulations toward AI civilization

Altera.AL, Andrew Ahn, Nic Becker, Stephanie Carroll, Nico Christie, Manuel Cortes, Arda Demirci, Melissa Du, Frankie Li, Shuying Luo, Peter Y Wang, Mathew Willows, Feitong Yang, Guangyu Robert Yang

TL;DR

This work defines AI civilization as the benchmark for autonomous, coexisting AI agents and introduces PIANO, a concurrent, bottlenecked architecture enabling real-time interaction and coherent multi-output behavior. Through Minecraft-based simulations, the authors demonstrate improvements in single-agent item progression, scalable social cognition in groups, and civilizational dynamics such as specialization, collective rules, and cultural transmission, including memes and a controlled religion. The results show autonomous specialization, law-adherence with democratic amendments, and religion propagation, illustrating plausible steps toward AI civilizations with complex institutional structures. Limitations include restricted spatial reasoning and the reliance on pre-trained models, suggesting future work to extend vision, navigation, and de novo social infrastructures. Overall, Project Sid opens avenues for large-scale simulations of agentic organizational intelligence and potential integration with human societal systems.

Abstract

AI agents have been evaluated in isolation or within small groups, where interactions remain limited in scope and complexity. Large-scale simulations involving many autonomous agents -- reflecting the full spectrum of civilizational processes -- have yet to be explored. Here, we demonstrate how 10 to 1000+ AI agents behave and progress within agent societies. We first introduce the PIANO (Parallel Information Aggregation via Neural Orchestration) architecture, which enables agents to interact with humans and other agents in real time while maintaining coherence across multiple output streams. We then evaluate agent performance in many-agent simulations using civilizational benchmarks inspired by human history. These simulations, set within a Minecraft environment, reveal that agents are capable of meaningful progress -- autonomously developing specialized roles, adhering to and changing collective rules, and engaging in cultural and religious transmission. These preliminary results show that agents can achieve significant milestones towards AI civilizations, opening new avenues for large-scale simulations, agentic organizational intelligence, and integrating AI into human civilizations.

Paper Structure

This paper contains 44 sections, 13 figures, 2 tables.

Figures (13)

  • Figure 1: From agent architecture to agent civilization
  • Figure 2: Data degradation in LLMs (left), LLM-powered agents (middle), and in multi-agent groups (right). Hallucinations are represented by green skull flasks. Hallucinations that are generated by a single LLM prompt can compound over successive LLM calls. An individual agent that hallucinates can also cause an entire group of agents to hallucinate through social interactions.
  • Figure 3: PIANO (Parallel Input Aggregation via Neural Orchestration) architecture. WM: working memory. STM: short-term memory. LTM: long-term memory.
  • Figure 4: An example Minecraft technology dependency tree for the mining of gold, diamond, and emeralds.
  • Figure 5: Individual agent progression in Minecraft. A. Unique Minecraft items acquired by individual agents across time (25 agents). Individual agent performance was assessed using a baseline architecture (see Methods), the full PIANO architecture, and the full PIANO architecture with the action awareness module ablated. Individual lines are results averaged across 5 repeated simulations. B. Unique Minecraft items acquired by 49 agents over 4 hours for a single simulation. Solid red line denotes cumulative unique items acquired by all agents. Dotted grey line denotes average number of unique items acquired across all individual agents.
  • ...and 8 more figures
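
The two curves described for Figure 5B (cumulative unique items across all agents, and the average number of unique items per agent) can be illustrated with a minimal sketch. The event log format, the function name `item_progression`, and the sample item names are assumptions made for illustration; they are not taken from the paper's code or data.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical log entry: (timestamp in seconds, agent id, item name).
Event = Tuple[float, str, str]


def item_progression(
    events: List[Event], agents: List[str]
) -> List[Tuple[float, int, float]]:
    """After each event, return (time, unique items across all agents,
    mean unique items per agent)."""
    seen_all: Set[str] = set()
    seen_by_agent: Dict[str, Set[str]] = {a: set() for a in agents}
    curve = []
    for t, agent, item in sorted(events):
        seen_all.add(item)              # group-level unique items
        seen_by_agent[agent].add(item)  # per-agent unique items
        mean_per_agent = sum(len(s) for s in seen_by_agent.values()) / len(agents)
        curve.append((t, len(seen_all), mean_per_agent))
    return curve


# Made-up sample data for two agents.
events = [
    (10.0, "a1", "oak_log"),
    (20.0, "a2", "oak_log"),
    (30.0, "a1", "wooden_pickaxe"),
    (40.0, "a2", "cobblestone"),
]
curve = item_progression(events, ["a1", "a2"])
```

In this toy log the group-level count can exceed the per-agent average because different agents acquire different items, which is the gap the caption's two lines are meant to show.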