A Definition of Open-Ended Learning Problems for Goal-Conditioned Agents
Olivier Sigaud, Gianluca Baldassarre, Cedric Colas, Stephane Doncieux, Richard Duro, Pierre-Yves Oudeyer, Nicolas Perrin-Gilbert, Vieri Giuliano Santucci
TL;DR
The paper addresses the lack of a formal, consensus definition of open-ended learning by isolating an elementary property: the recurring emergence of novelty, from an observer's perspective, over an infinite horizon. It formalizes open-ended learning problems and, in particular, open-ended goal-conditioned reinforcement learning (GCRL), for which it introduces first-order and second-order variants. It then shows how open-ended learning can be combined with lifelong learning and with autotelic or teachable goal generation, and discusses evaluation strategies and practical limitations. The resulting framework supports the principled study of evolving goal spaces and curricula, with implications for developmental AI and continual-learning research, and highlights the need for explicit progress metrics and richer goal-discovery mechanisms in future work.
Abstract
Many recent machine learning research papers have "open-ended learning" in their title, but very few of them attempt to define what they mean by the term. Even worse, on closer inspection, there seems to be no consensus on what distinguishes open-ended learning from related concepts such as continual learning, lifelong learning, or autotelic learning. In this paper, we contribute to fixing this situation. After illustrating the genealogy of the concept and more recent perspectives on what it truly means, we point out that open-ended learning is generally conceived as a composite notion encompassing a set of diverse properties. In contrast with previous approaches, we propose to isolate a key elementary property of open-ended processes, which is to produce, from time to time and over an infinite horizon, elements (e.g., observations, options, reward functions, and goals) that are considered novel from an observer's perspective. From there, we build the notion of open-ended learning problems and focus in particular on the subset of open-ended goal-conditioned reinforcement learning problems, in which agents can learn a growing repertoire of goal-driven skills. Finally, we highlight the work that remains to be done to bridge the gap between our elementary definition and the more involved notions of open-ended learning that developmental AI researchers may have in mind.
