OMNI: Open-endedness via Models of human Notions of Interestingness
Jenny Zhang, Joel Lehman, Kenneth Stanley, Jeff Clune
TL;DR
OMNI tackles the Achilles' heel of open-ended learning, the inability to prioritize tasks that are interesting rather than merely learnable. It encodes human notions of interestingness in a Model of Interestingness (MoI) derived from foundation models and combines it with a learning progress (LP) curriculum to prioritize tasks that are both learnable and worthwhile. The approach is evaluated in finite-task domains (Crafter, BabyAI) and an infinite-task space (AI2-THOR), where OMNI consistently outperforms uniform sampling and LP alone, and approaches oracle MoI performance. In the infinite-task setting, a GPT-4-driven task generator paired with LP sustains the discovery of learnable tasks, and MoI filtering further improves efficiency and diversity. The results suggest a general, scalable recipe for auto-curricula that uses human-aligned judgments to steer open-ended exploration toward meaningful progress, while highlighting avenues for safety and refinement via human feedback.
Abstract
Open-ended algorithms aim to learn new, interesting behaviors forever. This requires a vast environment search space, which in turn contains infinitely many possible tasks. Even after filtering for tasks the current agent can learn (i.e., learning progress), countless learnable yet uninteresting tasks remain (e.g., minor variations of previously learned tasks). An Achilles' heel of open-endedness research is the inability to quantify (and thus prioritize) tasks that are not just learnable, but also $\textit{interesting}$ (e.g., worthwhile and novel). We propose solving this problem by $\textit{Open-endedness via Models of human Notions of Interestingness}$ (OMNI). The insight is that we can use foundation models (FMs) as a model of interestingness (MoI), because they $\textit{already}$ internalize human concepts of interestingness from training on vast amounts of human-generated data, in which humans naturally write about what they find interesting or boring. We show that FM-based MoIs improve open-ended learning by focusing on tasks that are both learnable $\textit{and interesting}$, outperforming baselines based on uniform task sampling or learning progress alone. This approach has the potential to dramatically advance the ability to intelligently select which tasks to focus on next (i.e., auto-curricula), and could be seen as AI selecting its own next task to learn, facilitating self-improving AI and AI-Generating Algorithms. Project website: https://www.jennyzhangzt.com/omni/
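
To make the idea of sampling tasks that are both learnable and interesting concrete, the following is a minimal sketch, not the authors' implementation. It assumes each task already has a non-negative learning progress score and a binary interestingness judgment (which in OMNI would come from querying a foundation model). The function names, the `boring_weight` parameter, and the example task names are hypothetical and chosen only for illustration.

```python
import random


def sampling_weights(learning_progress, is_interesting, boring_weight=0.0):
    """Combine learning progress with a binary interestingness judgment.

    learning_progress: dict mapping task name -> LP score (assumed >= 0).
    is_interesting: dict mapping task name -> bool (assumed MoI output).
    boring_weight: hypothetical scale applied to tasks judged uninteresting.
    Returns a dict of normalized sampling probabilities.
    """
    raw = {}
    for task, lp in learning_progress.items():
        scale = 1.0 if is_interesting[task] else boring_weight
        raw[task] = max(lp, 0.0) * scale

    total = sum(raw.values())
    if total == 0.0:
        # No task has any weight: fall back to uniform sampling.
        return {task: 1.0 / len(raw) for task in raw}
    return {task: weight / total for task, weight in raw.items()}


def sample_task(weights, rng=random):
    """Draw one task according to the normalized weights."""
    tasks = list(weights)
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=1)[0]


# Illustrative, made-up values: a minor variation of a learned task is
# learnable (non-zero LP) but judged uninteresting, so it is filtered out.
lp_scores = {"collect wood": 0.4, "collect wood twice": 0.4, "make pickaxe": 0.2}
moi_flags = {"collect wood": True, "collect wood twice": False, "make pickaxe": True}

weights = sampling_weights(lp_scores, moi_flags)
next_task = sample_task(weights, random.Random(0))
```

With `boring_weight=0.0` the uninteresting variation is never sampled. A small positive value would instead down-weight such tasks without excluding them entirely, which is a design choice this sketch leaves open.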
