Intelligence at the Edge of Chaos
Shiyang Zhang, Aakash Patel, Syed A Rizvi, Nianchen Liu, Sizhuang He, Amin Karbasi, Emanuele Zappala, David van Dijk
TL;DR
This work investigates whether intelligence in large language models can emerge from exposure to complex, rule-based data rather than data produced by human intelligence. The authors pretrain GPT-2 variants on sequences generated by elementary cellular automata spanning the Wolfram complexity classes, then evaluate the models on downstream tasks (ARC-inspired reasoning, Nim, and chess move prediction). They find a positive link between data complexity and downstream performance, peaking near the edge of chaos (Class IV), where the data are structured yet hard to predict. Attention analyses show that models trained on more complex data attend to longer temporal histories, which suggests that they develop nontrivial, transferable representations rather than trivial rule-following. The study highlights implications for data-centric AI development, offering a framework to harness complexity for emergent capabilities and providing reproducible pipelines for future exploration.
Abstract
We explore the emergence of intelligent behavior in artificial systems by investigating how the complexity of rule-based systems influences the capabilities of models trained to predict these rules. Our study focuses on elementary cellular automata (ECA), simple yet powerful one-dimensional systems that generate behaviors ranging from trivial to highly complex. By training distinct Large Language Models (LLMs) on different ECAs, we evaluate the relationship between the complexity of the rules' behavior and the intelligence exhibited by the LLMs, as reflected in their performance on downstream tasks. Our findings reveal that rules with higher complexity lead to models exhibiting greater intelligence, as demonstrated by their performance on reasoning and chess move prediction tasks. Uniform and periodic systems, and often also highly chaotic systems, result in poorer downstream performance, highlighting a sweet spot of complexity conducive to intelligence. We conjecture that intelligence arises from the ability to predict complexity and that creating intelligence may require only exposure to complexity.
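To make concrete what "sequences generated by elementary cellular automata" means, the following is a minimal sketch of the standard ECA update under Wolfram's rule numbering. It is not the authors' data pipeline: the function names `eca_step` and `eca_run`, the periodic (wrap-around) boundary, and the single-seed initial state are assumptions made for illustration.

```python
def eca_step(state, rule):
    """Apply one ECA update to a list of 0/1 cells.

    Wolfram numbering: the neighborhood (left, center, right) is read as a
    3-bit number k = 4*left + 2*center + right, and the new cell value is
    bit k of the rule number (0..255).
    Assumption: periodic boundaries, so the row wraps around at both ends.
    """
    n = len(state)
    return [
        (rule >> (4 * state[(i - 1) % n] + 2 * state[i] + state[(i + 1) % n])) & 1
        for i in range(n)
    ]


def eca_run(initial, rule, steps):
    """Return the initial row followed by `steps` successive updates."""
    rows = [list(initial)]
    for _ in range(steps):
        rows.append(eca_step(rows[-1], rule))
    return rows


if __name__ == "__main__":
    # Illustrative initial state: a single live cell in a row of width 31.
    width = 31
    seed = [0] * width
    seed[width // 2] = 1
    # Rule 110 is a commonly cited Class IV rule.
    for row in eca_run(seed, rule=110, steps=15):
        print("".join("#" if cell else "." for cell in row))
```

Changing the `rule` argument shows the range of behaviors the abstract describes: rule 0 collapses to a uniform state after one step, while rule 30 is a commonly cited chaotic (Class III) rule. Each row of the output is one time step of the kind of binary sequence a model could be trained to predict.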
