OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code

Maxence Faldor, Jenny Zhang, Antoine Cully, Jeff Clune

TL;DR

OMNI-EPIC proposes a scalable, open-ended curriculum framework that uses foundation models to jointly generate tasks, environments, and rewards, and that checks novelty and learnability with models of interestingness (MoIs) and a retrieval-based post-check. By combining a growing task archive, language- and code-based environment synthesis, and RL training of specialist agents, the approach aims to realize Darwin Completeness in a constrained simulator. Empirical results show increased task diversity, sustained progress, and strong alignment between automated success detectors and human judgments, suggesting a viable path toward endlessly inventing and mastering new learning challenges. The work highlights both the potential and the current limitations of fully automated, code-generated environments, and points to future directions for generalist agents and broader simulation platforms.

Abstract

Open-ended and AI-generating algorithms aim to continuously generate and solve increasingly complex tasks indefinitely, offering a promising path toward more general intelligence. To accomplish this grand vision, learning must occur within a vast array of potential tasks. Existing approaches to automatically generating environments are constrained within manually predefined, often narrow distributions of environments, limiting their ability to create any learning environment. To address this limitation, we introduce a novel framework, OMNI-EPIC, that augments previous work in Open-endedness via Models of human Notions of Interestingness (OMNI) with Environments Programmed in Code (EPIC). OMNI-EPIC leverages foundation models to autonomously generate code specifying the next learnable (i.e., not too easy or difficult for the agent's current skill set) and interesting (e.g., worthwhile and novel) tasks. OMNI-EPIC generates both environments (e.g., an obstacle course) and reward functions (e.g., progress through the obstacle course quickly without touching red objects), enabling it, in principle, to create any simulatable learning task. We showcase the explosive creativity of OMNI-EPIC, which continuously innovates to suggest new, interesting learning challenges. We also highlight how OMNI-EPIC can adapt to reinforcement learning agents' learning progress, generating tasks that are of suitable difficulty. Overall, OMNI-EPIC can endlessly create learnable and interesting environments, further propelling the development of self-improving AI systems and AI-Generating Algorithms. Project website with videos: https://dub.sh/omniepic
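
The loop described in the abstract (propose a task, check that it is interesting, train on it, record the outcome in the archive) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the names `Task`, `Archive`, `run_loop`, and `make_toy_stubs` are hypothetical, and the three callables are deterministic toy stand-ins for the foundation-model task generator, the model of interestingness, and RL training with a success detector.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Task:
    description: str       # natural-language task description
    env_code: str          # code for environment and reward (placeholder here)
    learned: bool = False  # set after training


@dataclass
class Archive:
    # Holds both learned and failed tasks, as in the Figure 1 caption.
    tasks: List[Task] = field(default_factory=list)


def run_loop(
    archive: Archive,
    propose: Callable[[List[Task]], Task],
    is_interesting: Callable[[Task, List[Task]], bool],
    train_and_check: Callable[[Task], bool],
    iterations: int,
) -> Archive:
    """Hypothetical outer loop; each callable is an assumed stand-in."""
    for _ in range(iterations):
        task = propose(archive.tasks)              # conditioned on the archive
        if not is_interesting(task, archive.tasks):
            continue                               # discarded, never trained on
        task.learned = train_and_check(task)       # train, then judge success
        archive.tasks.append(task)                 # keep successes and failures
    return archive


def make_toy_stubs() -> Tuple[Callable, Callable, Callable]:
    """Deterministic toy stand-ins so the sketch runs without any model."""
    counter = itertools.count()

    def propose(existing: List[Task]) -> Task:
        return Task(f"toy task {next(counter)}", "# placeholder environment code")

    def index(task: Task) -> int:
        return int(task.description.split()[-1])

    def is_interesting(task: Task, existing: List[Task]) -> bool:
        return index(task) % 3 != 2   # toy rule: reject every third proposal

    def train_and_check(task: Task) -> bool:
        return index(task) % 2 == 0   # toy rule: even-numbered tasks succeed

    return propose, is_interesting, train_and_check


if __name__ == "__main__":
    result = run_loop(Archive(), *make_toy_stubs(), iterations=6)
    for t in result.tasks:
        print(t.description, "learned" if t.learned else "failed")
```

In this sketch, rejected proposals never reach training, while both learned and failed tasks stay in the archive so that later proposals can be conditioned on them.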

Paper Structure (37 sections, 43 figures, 2 tables)

Figures (43)

  • Figure 1: OMNI-EPIC overview. OMNI-EPIC continuously generates and solves new, interesting tasks in simulation. Our approach maintains a task archive of learned and failed tasks.
  • Figure 2: Long Run with Simulated Learning. OMNI-EPIC generates a diverse array of tasks, ranging from wildly different objectives to interesting variations of similar overarching tasks. The node color reflects the generation number of the task. A check mark in the node means that the task was successfully learned. A ZZZ symbol means that the task was deemed uninteresting and discarded. The node connections illustrate which tasks were conditioned on when asking an FM to generate a similar yet new and interesting task. Grey nodes show task description seeds that initialized the run.
  • Figure 3: Short Run with Learning. OMNI-EPIC adapts to the current capabilities of trained RL agents, generating tasks that are both interesting and learnable. Tasks deemed interesting that are successfully learned are marked by a check and failures by a cross. Uninteresting tasks are not trained on and hence not included here. Arrows between tasks indicate instances where OMNI-EPIC modified a task that the RL agent failed to learn, adjusting the task difficulty to facilitate learning.
  • Figure 4: OMNI-EPIC generates significantly more diverse tasks and continues to innovate throughout the run. (Left) Cell coverage of archive diversity plots in long runs with simulated learning by OMNI-EPIC and the controls. (Right) ANNECS-OMNI measure of progress for OMNI-EPIC and the controls. Dotted lines are median values, shaded regions are 95% confidence intervals.
  • Figure 5: OMNI-EPIC in a game interface.
  • ...and 38 more figures