Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning

Heqing Yang, Ziyuan Jiao, Shu Wang, Yida Niu, Si Liu, Hangxin Liu

TL;DR

EPoG addresses the challenge of planning in partially known environments by unifying exploration with sequential manipulation on a graph-based scene representation. It uses a bi-level planning framework in which a global planner maintains a belief graph and a local LLM-based replanner handles exceptions; the global plan is derived from the graph edit operations that transform the belief graph into the goal graph. Across 46 ProcThor-10k scenes, EPoG reaches a 91.3% success rate and reduces travel distance by 36.1% on average while also lowering exploration cost, and real-world mobile manipulator experiments demonstrate feasibility. The work highlights the value of integrating LLM priors with graph-based planning for robust, long-horizon robotic manipulation in unknown environments.

Abstract

In partially known environments, robots must combine exploration to gather information with task planning for efficient execution. To address this challenge, we propose EPoG, an Exploration-based sequential manipulation Planning framework on Scene Graphs. EPoG integrates a graph-based global planner with a Large Language Model (LLM)-based situated local planner, continuously updating a belief graph using observations and LLM predictions to represent known and unknown objects. Action sequences are generated by computing graph edit operations between the goal and belief graphs, ordered by temporal dependencies and movement costs. This approach seamlessly combines exploration and sequential manipulation planning. In ablation studies across 46 realistic household scenes and 5 long-horizon daily object transportation tasks, EPoG achieved a success rate of 91.3%, reducing travel distance by 36.1% on average. Furthermore, a physical mobile manipulator successfully executed complex tasks in unknown and dynamic environments, demonstrating EPoG's potential for real-world applications.
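
The planning step described in the abstract (graph edit operations between the belief and goal graphs, ordered by temporal dependencies) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a hypothetical encoding in which each scene graph is a dictionary mapping an object to its supporting parent, considers only one kind of dependency (a destination object must be placed before anything is put on it), and ignores movement costs. The function names `edit_operations` and `order_operations` are illustrative only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def edit_operations(belief, goal):
    """Return {object: (believed_parent, goal_parent)} for every object
    whose parent in the goal graph differs from the belief graph.
    A believed parent of None means the location is still unknown
    (assumption: such objects would require exploration first)."""
    return {
        obj: (belief.get(obj), goal_parent)
        for obj, goal_parent in goal.items()
        if belief.get(obj) != goal_parent
    }


def order_operations(ops):
    """Topologically sort the moves so that an object's new parent,
    if it also has to be moved, is handled before the object itself."""
    deps = {
        obj: ({new_parent} if new_parent in ops else set())
        for obj, (_, new_parent) in ops.items()
    }
    return list(TopologicalSorter(deps).static_order())


# Hypothetical example: the cup must end up on the plate,
# and the plate must first be moved to the table.
belief = {"plate": "cabinet", "cup": "counter", "apple": "fridge"}
goal = {"plate": "table", "cup": "plate", "apple": "fridge"}

ops = edit_operations(belief, goal)
plan = order_operations(ops)
print(plan)
```

In this example the apple needs no action because its parent is unchanged, and the plate is scheduled before the cup because the cup's destination depends on it. A fuller planner would also break ties among independent moves using travel cost, which this sketch leaves out.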

Paper Structure (15 sections, 1 equation, 7 figures, 2 tables, 2 algorithms)

Figures (7)

  • Figure 1: An example illustrating the challenges of integrating exploration and sequential manipulation: (a) Robots must prioritize potential exploration locations and balance exploration with manipulation tasks to execute efficiently. (b) The robot needs to engage in situated planning to handle unexpected situations.
  • Figure 2: Overview of the proposed EPoG framework. In the global planner, the belief graph is updated by estimating the locations of target objects present in the goal graph but missing in the initial graph, using new observations and LLM predictions. The action plan is then obtained by performing a topological sort on the graph edit operations between the belief and the goal graph. In the local planner, the LLM generates a situated action sequence to handle exceptions encountered during execution.
  • Figure 3: An example illustration of an indoor scene.
  • Figure 4: Examples of exceptions in motion planning. (a) Blocking: The robot must avoid collisions with other objects while placing or picking up objects. (b) Inaccessibility: Successful object retrieval or placement within a container requires the container to be opened first. (c) Collision: The robot must ensure that placed objects do not collide with the environment. (d) Instability: The robot must maintain the stability of stacked objects during manipulation; for example, retrieving a book beneath a cup may cause instability.
  • Figure 5: Prompt templates used by LLMPlanner: chain-of-thought (CoT) prompts and action-primitive prompts.
  • ...and 2 more figures