CLMASP: Coupling Large Language Models with Answer Set Programming for Robotic Task Planning
Xinrui Lin, Yangfan Wu, Huanyu Yang, Yu Zhang, Yanyong Zhang, Jianmin Ji
TL;DR
This paper addresses the challenge of grounding open-world robotic task plans generated by Large Language Models (LLMs). It proposes CLMASP, a two-stage approach that first uses an LLM to produce a skeleton plan and then refines it with ASP-based reasoning over a formal action model, with a vector-grounding step that aligns the plan's object references with the scene. On VirtualHome, CLMASP raises the executable rate from under 2% for vanilla LLM planning to over 90%, and it improves goal achievement by combining the SR, RG, and ASP modules. The framework is training-free, integrates external knowledge flexibly, and could be extended by automating ASP program generation, which makes it practical for robust robot task planning.
Abstract
Large Language Models (LLMs) possess extensive foundational knowledge and moderate reasoning abilities, making them suitable for general task planning in open-world scenarios. However, it is challenging to ground an LLM-generated plan so that it is executable by a specific robot with its own restrictions. This paper introduces CLMASP, an approach that couples LLMs with Answer Set Programming (ASP) to overcome this limitation. ASP is a non-monotonic logic programming formalism well suited to representing and reasoning about a robot's action knowledge. CLMASP begins with an LLM generating a basic skeleton plan, which is then tailored to the specific scenario using a vector database. This plan is subsequently refined by an ASP program encoding the robot's action knowledge, which integrates implementation details into the skeleton and grounds the LLM's abstract outputs in practical robot contexts. Our experiments on the VirtualHome platform demonstrate CLMASP's efficacy: it raises the executable rate from under 2% for baseline LLM approaches to over 90%.
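
The three stages described in the abstract (skeleton plan, scene grounding, refinement with action knowledge) can be illustrated with a toy sketch. This is not the authors' implementation. The skeleton is hard-coded rather than produced by an LLM, a character-bigram similarity stands in for the vector database, and a small procedural precondition-repair loop stands in for the ASP solver. The scene objects, the containment facts, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical scene and containment facts (assumptions, not from the paper).
SCENE_OBJECTS = ["fridge", "milk carton", "kitchen table", "water glass"]
INSIDE = {"milk carton": "fridge"}


def embed(text):
    """Toy embedding: character-bigram counts, standing in for a real model."""
    t = text.lower()
    return Counter(t[i:i + 2] for i in range(len(t) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ground(name, scene=SCENE_OBJECTS):
    """Map an abstract object name to the most similar object in the scene."""
    query = embed(name)
    return max(scene, key=lambda obj: cosine(query, embed(obj)))


def refine(skeleton):
    """Insert the walk/open steps that each skeleton action requires.

    A procedural stand-in for the ASP refinement stage: it only knows that
    the robot must be near an object to act on it, and that a container
    must be opened before grabbing something inside it.
    """
    near, opened, plan = None, set(), []

    def walk_to(obj):
        nonlocal near
        if near != obj:
            plan.append(("walk", obj))
            near = obj

    for verb, obj in skeleton:
        if verb == "walk":
            walk_to(obj)
            continue
        if verb == "grab":
            container = INSIDE.get(obj)
            if container is not None and container not in opened:
                walk_to(container)
                plan.append(("open", container))
                opened.add(container)
        walk_to(obj)
        plan.append((verb, obj))
    return plan


def plan_pipeline(skeleton):
    """Ground every object reference, then refine the grounded skeleton."""
    grounded = [(verb, ground(obj)) for verb, obj in skeleton]
    return refine(grounded)


if __name__ == "__main__":
    # A skeleton such as an LLM might return for "get milk and a glass".
    for step in plan_pipeline([("grab", "milk"), ("grab", "glass")]):
        print(step)
```

In this sketch the two-step skeleton expands into six executable steps, because the refinement adds the walking and fridge-opening actions that the skeleton leaves implicit. A declarative ASP encoding would instead state the preconditions and effects as rules and let a solver search for a plan consistent with them.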
