AGITB: A Signal-Level Benchmark for Evaluating Artificial General Intelligence
Matej Šprogar
TL;DR
AGITB is a signal-level benchmark for artificial general intelligence that measures a model's ability to forecast the next input in temporal binary sequences without pretraining or semantic grounding. It defines fourteen interdependent requirements, including an unbiased initial state, determinism, and temporal adaptability, and applies an all-or-nothing, self-referential evaluation across 100 trials. The analysis compares humans, symbolic programs, artificial neural networks, and large language models, and finds that no current AI system satisfies all criteria, which highlights the remaining gap to AGI. Positioned against ARC and NeuroBench, AGITB emphasises low-level, cortex-inspired invariants grounded in principles of neural processing, and provides an open-source reference implementation to guide progress toward general, adaptive learning. The framework supports principled, interpretable assessment of generality beyond task-specific performance, with potential implications for NeuroAI and embodied cognition research.
Abstract
Current AI systems demonstrate remarkable capabilities yet remain specialised, in part because no unified measure of general intelligence has been established. Existing evaluation frameworks, which focus primarily on language or perception tasks, offer limited insight into generality. The Artificial General Intelligence Testbed (AGITB) introduces a complementary benchmarking suite of fourteen elementary tests, with thirteen implemented as fully automated procedures. AGITB evaluates models on their ability to forecast the next input in a temporal sequence, step by step, without pretraining, symbolic manipulation, or semantic grounding. The framework isolates core computational invariants, such as determinism, sensitivity, and generalisation, that parallel principles of biological information processing. Designed to resist brute-force or memorisation-based strategies, AGITB enforces unbiased and autonomous learning. The human cortex satisfies all tests, whereas no current AI system meets the full AGITB criteria. This gap demonstrates the value of AGITB as a rigorous, interpretable, and actionable benchmark for evaluating progress toward artificial general intelligence. A reference implementation of AGITB is freely available on GitHub.
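To make the evaluation idea concrete, the following is a minimal sketch of step-by-step next-input forecasting and of an all-or-nothing check repeated over many trials. It is not the AGITB reference implementation. The names `RepeatLast`, `prediction_accuracy`, `determinism_requirement`, and `passes_all_trials` are hypothetical, inputs are simplified to single bits, and the determinism check shown here (two fresh instances must produce identical predictions for identical input) is only one assumed reading of that requirement.

```python
import random


class RepeatLast:
    """Hypothetical toy model: predicts that the next bit equals the current bit."""

    def step(self, observation):
        # Receive the current input and return a forecast of the next input.
        return observation


def prediction_accuracy(model, sequence):
    """Feed the sequence one input at a time and score each next-input forecast."""
    if len(sequence) < 2:
        raise ValueError("need at least two inputs to score a forecast")
    correct = 0
    prediction = None
    for x in sequence:
        # Compare the forecast made at the previous step with the actual input.
        if prediction is not None and prediction == x:
            correct += 1
        prediction = model.step(x)
    return correct / (len(sequence) - 1)


def determinism_requirement(make_model, rng, length=32):
    """Assumed reading of determinism: fresh instances agree on identical input."""
    sequence = [rng.getrandbits(1) for _ in range(length)]
    first, second = make_model(), make_model()
    return [first.step(x) for x in sequence] == [second.step(x) for x in sequence]


def passes_all_trials(make_model, requirement, trials=100, seed=0):
    """All-or-nothing aggregation: a single failed trial fails the requirement."""
    rng = random.Random(seed)
    return all(requirement(make_model, rng) for _ in range(trials))
```

For example, `prediction_accuracy(RepeatLast(), [0, 0, 0, 0])` scores every forecast as correct, while the alternating sequence `[0, 1, 0, 1]` defeats this toy model at every step. A model built without pretraining would be constructed fresh by `make_model` in each trial, which is why the sketch passes a factory rather than an instance.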
