Constrained Natural Language Action Planning for Resilient Embodied Systems

Grayson Byrd, Corban Rivera, Bethany Kemp, Meghan Booker, Aurora Schmidt, Celso M de Melo, Lalithkumar Seenivasan, Mathias Unberath

TL;DR

This work tackles the unreliability of purely LLM-based planning in embodied tasks by introducing SCLPlan, a hybrid system that binds LLM reasoning to a formal symbolic planner via a PDDL domain to enforce hard constraints. The approach preserves the adaptability and generalization of LLMs while providing explicit verification and decision boundaries through Precondition Verification and a Global Symbolic Planner, improving reliability, repeatability, and transparency. Across ALFWorld, AI2Thor, and real-world Spot experiments, SCLPlan achieves substantial gains in Task Success, reduces planning cost metrics, and demonstrates robust transfer from simulation to physical hardware, outperforming both purely LLM and purely symbolic baselines. The results suggest a practical path toward resilient, open-world capable embodied systems, with potential for broader applicability as symbolic augmentations and low-level control mature.

Abstract

Replicating human-level intelligence in the execution of embodied tasks remains challenging due to the unconstrained nature of real-world environments. Novel use of large language models (LLMs) for task planning seeks to address the previously intractable state/action space of complex planning tasks, but hallucinations limit their reliability, and thus, viability beyond a research context. Additionally, the prompt engineering required to achieve adequate system performance lacks transparency, and thus, repeatability. In contrast to LLM planning, symbolic planning methods offer strong reliability and repeatability guarantees, but struggle to scale to the complexity and ambiguity of real-world tasks. We introduce a new robotic planning method that augments LLM planners with symbolic planning oversight to improve reliability and repeatability, and provide a transparent approach to defining hard constraints with considerably stronger clarity than traditional prompt engineering. Importantly, these augmentations preserve the reasoning capabilities of LLMs and retain impressive generalization in open-world environments. We demonstrate our approach in simulated and real-world environments. On the ALFWorld planning benchmark, our approach outperforms current state-of-the-art methods, achieving a near-perfect 99% success rate. Deployment of our method to a real-world quadruped robot resulted in 100% task success compared to 50% and 30% for pure LLM and symbolic planners across embodied pick and place tasks. Our approach presents an effective strategy to enhance the reliability, repeatability and transparency of LLM-based robot planners while retaining their key strengths: flexibility and generalizability to complex real-world environments. We hope that this work will contribute to the broad goal of building resilient embodied intelligent systems.

Paper Structure

This paper contains 23 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Overview of our Symbolically Constrained Language Planner (SCLPlan). Design Phase: An engineer specifies the planning constraints of the system (i.e., the preconditions and effects of each action) through a PDDL environment domain. The engineer then creates an LLM Planning Prompt that provides natural language descriptions of each available action. Planning Phase: SCLPlan first uses an LLM to produce a PDDL goal state that its Symbolic Planner can use to generate a planning solution. SCLPlan then plans sequentially to achieve the task, leveraging both the LLM and Symbolic Planner components where necessary.
  • Figure 2: Task planning example from each experimental environment.
  • Figure 3: An ablation study of SCLPlan on the ALFWorld task planning benchmark reveals a significant increase in Task Success percentage and reductions in the Token Count and Environment Steps required to complete each task. These improvements persist across a variety of open-source and cloud-based LLMs of varying competence.
  • Figure 4: SCLPlan architecture.
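
The design and planning phases described in the Figure 1 caption can be illustrated with a small Python sketch. This is not the authors' implementation: the `Action` class, the predicate names, the breadth-first `symbolic_plan` search, and the `choose_action` fallback policy are all hypothetical stand-ins, assumed here only to show how preconditions and effects can gate an LLM-proposed action and how a symbolic planner can supply an alternative.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical STRIPS-style action: preconditions plus add/delete effects."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset = frozenset()


def applicable(state, action):
    # Precondition check: every precondition must hold in the current state.
    return action.preconditions <= state


def apply_action(state, action):
    # Apply effects: remove deleted predicates, then add new ones.
    return (state - action.del_effects) | action.add_effects


def symbolic_plan(state, goal, actions):
    """Breadth-first search over states; returns a list of action names or None."""
    start = frozenset(state)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, plan = queue.popleft()
        if goal <= current:
            return plan
        for action in actions:
            if applicable(current, action):
                nxt = apply_action(current, action)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, plan + [action.name]))
    return None


def choose_action(state, goal, actions, llm_proposal):
    """Accept the LLM's proposed action only if its preconditions hold;
    otherwise fall back to the first step of a symbolic plan (assumed policy)."""
    by_name = {a.name: a for a in actions}
    proposed = by_name.get(llm_proposal)
    if proposed is not None and applicable(state, proposed):
        return proposed.name
    plan = symbolic_plan(state, goal, actions)
    return plan[0] if plan else None


# Toy pick task; all predicates and action names are illustrative assumptions.
ACTIONS = [
    Action("goto_table", frozenset({"at_start"}),
           frozenset({"at_table"}), frozenset({"at_start"})),
    Action("pick_mug", frozenset({"at_table", "hand_empty"}),
           frozenset({"holding_mug"}), frozenset({"hand_empty"})),
]
STATE = frozenset({"at_start", "hand_empty"})
GOAL = frozenset({"holding_mug"})

if __name__ == "__main__":
    # The proposal "pick_mug" violates a precondition, so the fallback is used.
    print(choose_action(STATE, GOAL, ACTIONS, "pick_mug"))
```

In this sketch, a proposal that names an unknown action or violates a precondition is never executed, which is one way to read the hard constraints mentioned in the TL;DR. A real system would use a PDDL domain file and an off-the-shelf planner rather than the toy search above.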