Learning Generative Interactive Environments By Trained Agent Exploration

Naser Kazemi, Nedko Savov, Danda Paudel, Luc Van Gool

TL;DR

We release GenieRedux, an implementation based on Genie that is reproducible, scalable and adaptable to new types of environments, together with GenieRedux-G, a variant that uses the agent's readily available actions to factor out action prediction uncertainty during validation.

Abstract

World models are increasingly pivotal in interpreting and simulating the rules and actions of complex environments. Genie, a recent model, excels at learning from visually diverse environments but relies on costly human-collected data. We observe that its alternative data source, random agents, explores the environment too little. We propose to improve the model by employing reinforcement-learning-based agents for data generation. This approach produces diverse datasets that enhance the model's ability to adapt and perform well across various scenarios and realistic actions within the environment. In this paper, we first release GenieRedux, an implementation based on Genie. Additionally, we introduce GenieRedux-G, a variant that uses the agent's readily available actions to factor out action prediction uncertainty during validation. Our evaluation, including a replication of the Coinrun case study, shows that GenieRedux-G achieves superior visual fidelity and controllability when trained on data from trained agent exploration. The proposed approach is reproducible, scalable and adaptable to new types of environments. Our codebase is available at https://github.com/insait-institute/GenieRedux .
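
The exploration argument above can be illustrated with a toy sketch. This is not the GenieRedux data pipeline and does not use Coinrun: it assumes a hypothetical one-dimensional corridor, and `directed_policy` is a hand-coded stand-in for a trained agent (it simply prefers moving right). All names and parameters are illustrative assumptions. The sketch only shows that, under the same step budget, a uniformly random policy tends to visit far fewer distinct states than a goal-directed one.

```python
import random


def collect_states(policy, steps=200, length=500, seed=0):
    """Roll out `policy` in a toy 1-D corridor; return the set of visited positions."""
    rng = random.Random(seed)
    pos = 0
    visited = {pos}
    for _ in range(steps):
        action = policy(pos, rng)  # -1 moves left, +1 moves right
        # Clamp to the corridor so the agent cannot leave [0, length - 1].
        pos = min(max(pos + action, 0), length - 1)
        visited.add(pos)
    return visited


def random_policy(pos, rng):
    """Uniformly random actions, as a random exploration agent would take."""
    return rng.choice((-1, 1))


def directed_policy(pos, rng):
    """Hypothetical stand-in for a trained agent: moves right 90% of the time."""
    return 1 if rng.random() < 0.9 else -1


random_cov = len(collect_states(random_policy))
directed_cov = len(collect_states(directed_policy))
print(f"distinct states visited: random={random_cov}, directed={directed_cov}")
```

In this toy setting the random walk stays near the start, while the directed policy reaches states deep in the corridor. A world model trained only on the former would rarely see the later parts of the environment, which is the limitation the abstract attributes to random-agent data.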

Paper Structure (18 sections, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Architecture of our models. GenieRedux shares the architecture of Genie; GenieRedux-G takes agent actions as input instead of predicting them.
  • Figure 2: GenieRedux-G-TA Control Demonstration. GenieRedux-G-TA is able to consistently perform all environment actions. Here we demonstrate all of them as generated by the model.
  • Figure 3: Visual Fidelity Evaluation of GenieRedux, GenieRedux-G and their tokenizer, trained with random agent exploration (-Base), compared to training with trained agent exploration (-TA). Evaluation is done on the Diverse Test Set.
  • Figure 4: GenieRedux-G-TA Qualitative Result. We give a single frame and actions from the test set and generate 10 frames. In this example, our model first successfully continues the falling motion, then performs a jump. Ground truth frames are at the top; generated frames are at the bottom.
  • Figure 5: GenieRedux-Base Quantitative Evaluation. We present a few sequences from the test set with predictions from GenieRedux-Base. The top example shows a successful jump action. The bottom example shows a successful motion progression.
  • ...and 6 more figures