Assessing Adaptive World Models in Machines with Novel Games

Lance Ying; Katherine M. Collins; Prafull Sharma; Cedric Colas; Kaiya Ivy Zhao; Adrian Weller; Zenna Tavares; Phillip Isola; Samuel J. Gershman; Jacob D. Andreas; Thomas L. Griffiths; Francois Chollet; Kelsey R. Allen; Joshua B. Tenenbaum

TL;DR

This work argues that human-like rapid adaptation hinges on adaptive world models and introduces world model induction as a core capacity. It proposes a novel-game benchmark framework grounded in cognitive-science principles, using hierarchical world-model induction and active exploration to assess how quickly and robustly AI systems build and revise internal environment models. The paper details desiderata for designing such games, outlines a generative framework to continually produce novel challenges, and presents metrics for evaluating both behavioral performance and the underlying world models. If adopted, this paradigm could drive progress toward AI with improved sample efficiency, generalization, and human-like adaptability, contributing to the quest for artificial general intelligence.

Abstract

Human intelligence exhibits a remarkable capacity for rapid adaptation and effective problem-solving in novel and unfamiliar contexts. We argue that this profound adaptability is fundamentally linked to the efficient construction and refinement of internal representations of the environment, commonly referred to as world models, and we refer to this adaptation mechanism as world model induction. However, current understanding and evaluation of world models in artificial intelligence (AI) remain narrow, often focusing on static representations learned from training on massive corpora of data rather than on the efficiency and efficacy of learning these representations through interaction and exploration within a novel environment. In this Perspective, we provide a view of world model induction drawing on decades of research in cognitive science on how humans learn and adapt so efficiently; we then call for a new evaluation framework for assessing adaptive world models in AI. Concretely, we propose a new benchmarking paradigm based on suites of carefully designed games with genuine, deep and continually refreshing novelty in the underlying game structures -- we refer to this class of games as novel games. We detail key desiderata for constructing these games and propose appropriate metrics to explicitly challenge and evaluate an agent's capacity for rapid world model induction. We hope that this new evaluation framework will inspire future evaluation efforts on world models in AI and provide a crucial step towards developing AI systems capable of human-like rapid adaptation and robust generalization -- a critical component of artificial general intelligence.
