CaStL: Constraints as Specifications through LLM Translation for Long-Horizon Task and Motion Planning

Weihang Guo, Zachary Kingston, Lydia E. Kavraki

TL;DR

CaStL is a framework that identifies constraints such as goal conditions, action ordering, and action blocking from natural language in multiple stages, translates these constraints into PDDL and Python scripts, and solves them using a custom PDDL solver.

Abstract

Large Language Models (LLMs) have demonstrated remarkable ability in long-horizon Task and Motion Planning (TAMP) by translating clear and straightforward natural language problems into formal specifications such as the Planning Domain Definition Language (PDDL). However, real-world problems are often ambiguous and involve many complex constraints. In this paper, we introduce Constraints as Specifications through LLMs (CaStL), a framework that identifies constraints such as goal conditions, action ordering, and action blocking from natural language in multiple stages. CaStL translates these constraints into PDDL and Python scripts, which are solved using a custom PDDL solver. Tested across three PDDL domains, CaStL significantly improves constraint handling and planning success rates from natural language specification in complex scenarios.

Paper Structure

This paper contains 20 sections, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Our proposed method, CaStL, allows specification of Task and Motion Planning (TAMP) problems with constraints in natural language using a multi-step process (detailed in \ref{sec:method}). Here, a TAMP problem (Move the red block to another table) with an additional global constraint (Do not move the orange blocks) is specified. Our approach resolves ambiguities and breaks the problem down into a PDDL specification and a set of constraints that are added to an SMT-based TAMP solver (IDTMP) \cite{dantam2018incremental} with a Python API. This solver is capable of resolving motion constraints (here, the red block cannot be grasped without moving one colored pair of blocks out of the way). The color of each step corresponds to the module with the same color in \ref{fig:three_methods}.
  • Figure 2: Illustrations of approaches. (a) In our approach, CaStL, constraints are first extracted through multi-step LLM queries (\ref{sec:refine}). Then, the LLM translates natural language constraints into PDDL (\ref{sec:translate_pddl}) as well as Python scripts that use an API on our SMT-based PDDL solver within a TAMP algorithm, IDTMP \cite{dantam2018incremental} (\ref{sec:constraint_script}). (b) CaStL One-step is an ablation of CaStL, without the multi-step LLM process for extracting constraints. (c) Baseline. The problem is decomposed into natural language subproblems, which are translated sequentially into PDDL.
  • Figure 3: (a) The HC domain features a robot starting in Room 0, tasked with visiting a list of rooms, each of which is locked by a corresponding key. (b) The BW domain, which consists of pick, place, stack, and unstack actions with a number of blocks and tables.
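
To make the two constraint kinds named in the abstract concrete, the following is a minimal Python sketch of action blocking and action ordering, applied to the Figure 1 example ("Do not move the orange blocks"). It is an illustration only and not the paper's method: CaStL's generated scripts add constraints to the SMT-based solver through IDTMP's Python API, which is not shown in this summary, whereas this sketch merely checks a finished plan. All names here (`block_object`, `order_before`, `satisfies`, and the block and table identifiers) are hypothetical.

```python
from typing import Callable, List, Tuple

# A plan step is (action_name, arg1, arg2, ...), e.g. ("pick", "red_block", "table_0").
Action = Tuple[str, ...]
Constraint = Callable[[List[Action]], bool]


def block_object(obj: str) -> Constraint:
    """Action blocking: no step in the plan may take `obj` as an argument."""
    def check(plan: List[Action]) -> bool:
        return all(obj not in step[1:] for step in plan)
    return check


def order_before(first: Action, second: Action) -> Constraint:
    """Action ordering: if `second` occurs, `first` must occur earlier."""
    def check(plan: List[Action]) -> bool:
        if second not in plan:
            return True  # nothing to order
        return first in plan[:plan.index(second)]
    return check


def satisfies(plan: List[Action], constraints: List[Constraint]) -> bool:
    """A plan is acceptable only if every constraint holds."""
    return all(constraint(plan) for constraint in constraints)


# Hypothetical instance of the Figure 1 scenario: the orange blocks are blocked,
# so a different (here, blue) block is moved to clear the way to the red block.
constraints = [
    block_object("orange_block_1"),
    block_object("orange_block_2"),
    order_before(("pick", "blue_block_1", "table_0"),
                 ("pick", "red_block", "table_0")),
]

good_plan = [
    ("pick", "blue_block_1", "table_0"),
    ("place", "blue_block_1", "table_1"),
    ("pick", "red_block", "table_0"),
    ("place", "red_block", "table_1"),
]
bad_plan = [("pick", "orange_block_1", "table_0")] + good_plan
```

In this sketch `satisfies(good_plan, constraints)` holds while `satisfies(bad_plan, constraints)` does not, because the second plan touches a blocked object. Checking a finished plan is the simplest stand-in; a solver-side encoding would instead rule out such plans during search.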