Integrated Exploration and Sequential Manipulation on Scene Graph with LLM-based Situated Replanning
Heqing Yang, Ziyuan Jiao, Shu Wang, Yida Niu, Si Liu, Hangxin Liu
TL;DR
EPoG plans in partially known environments by unifying exploration with sequential manipulation on a scene-graph representation. It uses a bi-level framework: a graph-based global planner maintains a belief graph and derives actions from the graph edit operations that transform the belief graph into the goal graph, while an LLM-based local replanner handles exceptions. Across 46 ProcTHOR-10K scenes, EPoG achieves a 91.3% success rate and reduces travel distance by 36.1% on average, along with lower exploration cost, and experiments on a physical mobile manipulator demonstrate feasibility. The work highlights the value of combining LLM priors with graph-based planning for robust, long-horizon robotic manipulation in unknown environments.
Abstract
In partially known environments, robots must combine exploration to gather information with task planning for efficient execution. To address this challenge, we propose EPoG, an Exploration-based sequential manipulation Planning framework on Scene Graphs. EPoG integrates a graph-based global planner with a Large Language Model (LLM)-based situated local planner, continuously updating a belief graph using observations and LLM predictions to represent known and unknown objects. Action sequences are generated by computing graph edit operations between the goal and belief graphs, ordered by temporal dependencies and movement costs. This approach seamlessly combines exploration and sequential manipulation planning. In ablation studies across 46 realistic household scenes and 5 long-horizon daily object transportation tasks, EPoG achieved a success rate of 91.3%, reducing travel distance by 36.1% on average. Furthermore, a physical mobile manipulator successfully executed complex tasks in unknown and dynamic environments, demonstrating EPoG's potential for real-world applications.
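The idea of deriving actions from graph edit operations and ordering them by temporal dependencies and movement costs can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each scene graph is reduced to a child-to-parent mapping, that every differing parent yields a single "move" operation, that a move must wait until its destination object is already in place, and that ready moves are chosen greedily by straight-line travel cost. The names edit_operations, order_operations, and all object names and positions are hypothetical, and the belief-graph updates, LLM predictions, and exploration steps are omitted.

```python
from math import dist

def edit_operations(belief, goal):
    """One move (obj, src, dst) per object whose parent differs between graphs."""
    return [(obj, belief.get(obj), dst)
            for obj, dst in goal.items() if belief.get(obj) != dst]

def order_operations(ops, positions, start):
    """Order moves so dependencies are respected, picking the cheapest ready move."""
    positions = dict(positions)          # copy: object positions change as they move
    to_move = {obj for obj, _, _ in ops}
    pending, done, plan = list(ops), set(), []
    here = start
    while pending:
        # A move is ready once its destination is not itself still waiting to be moved.
        ready = [op for op in pending if op[2] not in to_move or op[2] in done]
        if not ready:
            raise ValueError("cyclic dependencies among moves")
        # Travel cost: current location -> source, then source -> destination.
        best = min(ready, key=lambda op: dist(positions[here], positions[op[1]])
                                         + dist(positions[op[1]], positions[op[2]]))
        plan.append(best)
        pending.remove(best)
        done.add(best[0])
        positions[best[0]] = positions[best[2]]  # the object now sits at its destination
        here = best[2]
    return plan

# Hypothetical toy scene: parents in the belief graph and in the goal graph.
belief = {"plate": "counter", "cup": "shelf", "fork": "drawer"}
goal = {"plate": "table", "cup": "plate", "fork": "table"}
positions = {"robot": (0, 0), "counter": (1, 0), "shelf": (5, 0),
             "drawer": (4, 0), "table": (3, 0), "plate": (1, 0)}

plan = order_operations(edit_operations(belief, goal), positions, "robot")
for obj, src, dst in plan:
    print(f"move {obj}: {src} -> {dst}")
```

In this toy scene the cup cannot be placed until the plate has reached the table, so the dependency rule defers it. The greedy cost rule then decides the order of the remaining ready moves. A full graph edit distance computation would also consider node and edge insertions and deletions, which this sketch leaves out.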
