Measuring Intelligence through Games
Tom Schaul, Julian Togelius, Jürgen Schmidhuber
TL;DR
The paper tackles the lack of a practical, general benchmark for AGI by extending Legg and Hutter's universal intelligence to finite time and restricting evaluation to a computable space of games described in a suitable language. It proposes a two-phase anytime measure that uses biased Monte Carlo sampling over game environments, weighted by environment complexity and time, so that diverse AI approaches can be compared at scale on equal terms. Key contributions include a concrete framework for time-aware intelligence measurement, explicit incorporation of resource constraints, and a discussion of game description languages and competitions as a way to realize general game intelligence testing. The approach aims to enable broad, interpretable, cross-domain assessment of general intelligence that is practical to run and can be refined through future competitions.
Abstract
Artificial general intelligence (AGI) refers to research aimed at tackling the full problem of artificial intelligence, that is, creating truly intelligent agents. This sets it apart from most AI research, which aims at solving problems in relatively narrow domains, such as character recognition, motion planning, or increasing player satisfaction in games. But how do we know when an agent is truly intelligent? A common point of reference in the AGI community is Legg and Hutter's formal definition of universal intelligence, which has the appeal of simplicity and generality but is unfortunately incomputable. Games of various kinds are commonly used as benchmarks for "narrow" AI research, as they are considered to have many important properties. We argue that many of these properties carry over to the testing of general intelligence as well. We then sketch how such testing could practically be carried out. The central part of this sketch is an extension of universal intelligence to deal with finite time, and the use of sampling of the space of games expressed in a suitably biased game description language.
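To make the sampling idea concrete, here is a minimal sketch of a Monte Carlo estimate of a complexity-weighted score. It is not the authors' procedure. The toy "game" (one paying arm among k), the use of k as a stand-in for description length, the fixed step budget standing in for finite time, and all function names are assumptions made purely for illustration. Because games are drawn with probability roughly proportional to 2^-k, a plain average of the sampled scores approximates the weighted sum.

```python
import random


def sample_complexity(rng, max_len=16):
    """Draw k >= 1 with P(k) roughly proportional to 2**-k (capped at max_len)."""
    k = 1
    while k < max_len and rng.random() < 0.5:
        k += 1
    return k


def play(agent, k, steps, rng):
    """Hypothetical toy game of 'complexity' k: exactly one of k arms pays 1.

    Returns the mean reward over the time budget, a value in [0, 1].
    """
    target = rng.randrange(k)
    history = []  # (arm, reward) pairs observed so far
    for _ in range(steps):
        arm = agent(k, history, rng)
        history.append((arm, 1 if arm == target else 0))
    return sum(reward for _, reward in history) / steps


def random_agent(k, history, rng):
    """Baseline: ignores feedback and picks an arm uniformly at random."""
    return rng.randrange(k)


def sweep_agent(k, history, rng):
    """Tries arms in order and stays on the first arm that paid out."""
    for arm, reward in history:
        if reward == 1:
            return arm
    return len(history) % k


def estimate_score(agent, n_games=2000, steps=20, seed=0):
    """Average score over games sampled with a bias toward simple games."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_games):
        k = sample_complexity(rng)
        total += play(agent, k, steps, rng)
    return total / n_games
```

Calling `estimate_score(sweep_agent)` and `estimate_score(random_agent)` compares the two agents on games drawn from the same biased distribution. Raising `steps` gives each agent a larger time budget per game, and raising `n_games` tightens the estimate.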
