Mars: Situated Inductive Reasoning in an Open-World Environment

Xiaojuan Tang, Jiaqi Li, Yitao Liang, Song-chun Zhu, Muhan Zhang, Zilong Zheng

TL;DR

Through Mars, an interactive environment devised for situated inductive reasoning, this paper aims to galvanize advancements in situated inductive reasoning and set the stage for developing the next generation of AI systems that can reason in an adaptive and context-sensitive way.

Abstract

Large Language Models (LLMs) trained on massive corpora have shown remarkable success in knowledge-intensive tasks. Yet most of them rely on pre-stored knowledge. Inducing new general knowledge from a specific environment and reasoning with the acquired knowledge, i.e., situated inductive reasoning, is crucial and challenging for machine intelligence. In this paper, we design Mars, an interactive environment devised for situated inductive reasoning. It introduces counter-commonsense game mechanisms by modifying terrain, survival settings, and task dependencies while adhering to certain principles. In Mars, agents need to actively interact with their surroundings, derive useful rules, and perform decision-making tasks in specific contexts. We conduct experiments on various RL-based and LLM-based methods and find that they all struggle on this challenging situated inductive reasoning benchmark. Furthermore, we explore Induction from Reflection, where we instruct agents to perform inductive reasoning over their historical trajectories. Its superior performance underscores the importance of inductive reasoning in Mars. Through Mars, we aim to galvanize advancements in situated inductive reasoning and set the stage for developing the next generation of AI systems that can reason in an adaptive and context-sensitive way.

Paper Structure

This paper contains 65 sections, 5 figures, 8 tables.

Figures (5)

  • Figure 1: Mars, an open-world environment for situated inductive reasoning, involves inducing rules through active interaction and applying the newly acquired rules to make context-sensitive decisions. Mars is built on Crafter, to which we add counter-commonsense elements. Agents interact with the environment and accumulate historical trajectories. For example, an agent might observe that, regardless of time or location, mining stone always yields diamonds and that a table can be crafted from 2 diamonds. The agent can then induce the rules "Mining stone yields diamond" and "Placing table consumes 2 diamonds". When tasked with making a wooden pickaxe, the agent can apply these rules to plan and execute specific actions in different contexts.
  • Figure 2: Examples of three kinds of modification to commonsense elements. Please refer to the appendix for more details.
  • Figure 3: An illustration of the Induction from Reflection pipeline for Mars. Given the selected task and the agent's observation, the planner decomposes the task into a sequence of subgoals. The controller then outputs specific actions to accomplish these subgoals. Successful plans are stored in the skill library, while failed plans prompt the agent to perform self-explanation and replan. The rule library is updated through reflection on the controller's execution: by performing inductive reasoning, the agent saves possible game rules for the proposer, planner, and controller to use.
  • Figure 4: Success rate (log scale) of unlocking 22 different achievements with the Skill Library model.
  • Figure 5: Evaluation of rule library
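
The rule induction described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the paper's implementation: the `Step` record, the `induce_rules` function, and the `min_support` threshold are hypothetical names introduced here, and the sketch assumes each trajectory step records an action, its target, and the resulting inventory change. A rule is kept only when the same action and target produced the same outcome every time it was observed, and it was observed at least `min_support` times.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One observed interaction (hypothetical record format)."""
    action: str    # e.g. "mine"
    target: str    # e.g. "stone"
    delta: tuple   # inventory change as (item, amount) pairs


def induce_rules(trajectory, min_support=2):
    """Keep (action, target) pairs whose outcome was identical every time."""
    outcomes = defaultdict(list)
    for step in trajectory:
        outcomes[(step.action, step.target)].append(step.delta)

    rules = {}
    for key, deltas in outcomes.items():
        consistent = len(set(deltas)) == 1
        if consistent and len(deltas) >= min_support:
            rules[key] = dict(deltas[0])
    return rules


# Toy trajectory mirroring the counter-commonsense example in Figure 1.
trajectory = [
    Step("mine", "stone", (("diamond", 1),)),
    Step("mine", "stone", (("diamond", 1),)),
    Step("place", "table", (("diamond", -2),)),
    Step("place", "table", (("diamond", -2),)),
    Step("mine", "tree", (("wood", 1),)),  # seen once: not enough support
]

rules = induce_rules(trajectory)
for (action, target), change in rules.items():
    print(action, target, change)
```

On this toy trajectory the sketch recovers the two rules from the caption ("Mining stone yields diamond" and "Placing table consumes 2 diamonds") and withholds a rule for mining trees, which was observed only once. An LLM-based agent would induce such rules in natural language rather than by exact matching; the sketch only shows the underlying generalization step.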