CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning

Xinrui Lin, Yangfan Wu, Huanyu Yang, Yu Zhang, Yanyong Zhang, Jianmin Ji

TL;DR

This paper addresses the challenge of grounding open-world robotic task plans generated by Large Language Models (LLMs). It proposes CLMASP, a two-stage approach in which an LLM first produces a skeleton plan and an Answer Set Programming (ASP) program then refines it by reasoning over a formal action model; a vector-based grounding step aligns the plan's object references with the scene. On VirtualHome, CLMASP raises the executable rate from under 2% for vanilla LLM planning to over 90%, and improves goal achievement by combining its Self-Refinement (SR), Referring-Grounding (RG), and ASP modules. The work demonstrates a scalable, training-free framework that flexibly integrates external knowledge and could be extended by automating ASP generation, offering practical impact for robust robot task planning.

Abstract

Large Language Models (LLMs) possess extensive foundational knowledge and moderate reasoning abilities, making them suitable for general task planning in open-world scenarios. However, it is challenging to ground an LLM-generated plan so that it is executable by a specific robot with certain restrictions. This paper introduces CLMASP, an approach that couples LLMs with Answer Set Programming (ASP) to overcome these limitations. ASP is a non-monotonic logic programming formalism renowned for its capacity to represent and reason about a robot's action knowledge. CLMASP starts with an LLM generating a basic skeleton plan, which is subsequently tailored to the specific scenario using a vector database. This plan is then refined by an ASP program encoding the robot's action knowledge, which integrates implementation details into the skeleton and grounds the LLM's abstract outputs in practical robot contexts. Our experiments conducted on the VirtualHome platform demonstrate CLMASP's efficacy. Compared to the baseline executable rate of under 2% with LLM approaches, CLMASP improves this to over 90%.
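
The step in which the skeleton plan is tailored to the scene through a vector database can be illustrated with a minimal sketch. It assumes a toy character-trigram embedding in place of a learned encoder and an in-memory list in place of a real vector database. The function names, the scene inventory, and the example plan are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def embed(text: str, n: int = 3) -> Counter:
    """Toy embedding: counts of character n-grams (a stand-in for a learned encoder)."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ground_noun(noun: str, scene_objects: list[str]) -> str:
    """Return the scene object whose embedding is closest to the plan's noun."""
    query = embed(noun)
    return max(scene_objects, key=lambda obj: cosine(query, embed(obj)))


# Hypothetical scene inventory and skeleton plan, for illustration only.
scene = ["washing_machine", "clothes_shirt", "laundry_detergent", "sofa"]
skeleton = [("find", "washing machine"), ("open", "washing machine")]

# Replace every noun in the skeleton with its nearest scene object.
grounded = [(verb, ground_noun(noun, scene)) for verb, noun in skeleton]
```

In a real system the embedding would presumably come from a pretrained text encoder and the search from a vector index, but the replacement rule stays the same: each noun is swapped for its nearest neighbour among the objects that exist in the scene.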

Paper Structure (28 sections, 19 equations, 1 figure, 1 table, 1 algorithm)

This paper contains 28 sections, 19 equations, 1 figure, 1 table, 1 algorithm.

Figures (1)

  • Figure 1: Flowchart of CLMASP applied to the "Wash Clothes" task. Program modules are represented by boxes, and arrows indicate the direction of data flow. Colored boxes above the arrows distinguish the types of data: flesh color for plans, purple for the action model, blue for the robot's observed states, and gray for prompts. The methods shown are the LLM's original output (Vanilla), Self-Refinement (SR), Referring-Grounding (RG), and ASP Programming (ASP); CLMASP integrates the three advanced methods (SR, RG, and ASP). Following the data flow in CLMASP, the initial plan is generated by the LLM, verb errors are self-corrected via LLM prompts, and incorrect nouns are replaced through nearest-vector search. In the ASP segment, human experts are still required to extract the Causal Model and translate it into ASP rules, whereas the translation of the skeleton plan and robot observations is fully automated. The executable rate of plans produced by CLMASP can exceed 90%.
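
The automated translation of the skeleton plan and robot observations mentioned in the caption can be pictured as plain string generation. The sketch below is an illustration under stated assumptions: the predicates `skeleton/3` and `holds/2`, the helper names, and the sample data are hypothetical and do not reproduce the paper's actual encoding.

```python
def atom(name: str) -> str:
    """Normalise a free-text name into a lowercase ASP constant."""
    cleaned = "".join(ch if ch.isalnum() else "_" for ch in name.strip().lower())
    # ASP constants must start with a lowercase letter; prefix anything else.
    return cleaned if cleaned and cleaned[0].isalpha() else f"o_{cleaned}"


def plan_to_facts(plan):
    """Emit one ordered fact per skeleton step: skeleton(Index, Verb, Object)."""
    return [
        f"skeleton({i}, {atom(verb)}, {atom(obj)})."
        for i, (verb, obj) in enumerate(plan, start=1)
    ]


def observations_to_facts(observations):
    """Emit the observed state as facts holding at time step 0."""
    return [f"holds({atom(prop)}({atom(obj)}), 0)." for obj, prop in observations]


# Hypothetical skeleton plan and observations for the "Wash Clothes" task.
plan = [("Find", "washing machine"), ("Put in", "clothes")]
observations = [("washing machine", "closed"), ("clothes", "dirty")]

# The resulting facts would be handed to an ASP solver together with
# the expert-written action rules, which this sketch does not include.
facts = plan_to_facts(plan) + observations_to_facts(observations)
print("\n".join(facts))
```

The expert-written rules for the action model are deliberately left out here, since the caption states that this part still requires human effort.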