Constrained Natural Language Action Planning for Resilient Embodied Systems
Grayson Byrd, Corban Rivera, Bethany Kemp, Meghan Booker, Aurora Schmidt, Celso M de Melo, Lalithkumar Seenivasan, Mathias Unberath
TL;DR
This work tackles the unreliability of purely LLM-based planning in embodied tasks by introducing SCLPlan, a hybrid system that binds LLM reasoning to a formal symbolic planner via a PDDL domain to enforce hard constraints. The approach preserves the adaptability and generalization of LLMs while providing explicit verification and decision boundaries through Precondition Verification and a Global Symbolic Planner, improving reliability, repeatability, and transparency. Across ALFWorld, AI2-THOR, and real-world Spot experiments, SCLPlan achieves substantial gains in Task Success, reduces planning cost, and transfers robustly from simulation to physical hardware, outperforming both purely LLM-based and purely symbolic baselines. The results suggest a practical path toward resilient, open-world-capable embodied systems, with potential for broader applicability as symbolic augmentations and low-level control mature.
Abstract
Replicating human-level intelligence in the execution of embodied tasks remains challenging due to the unconstrained nature of real-world environments. Novel use of large language models (LLMs) for task planning seeks to address the previously intractable state/action space of complex planning tasks, but hallucinations limit their reliability and thus their viability beyond a research context. Additionally, the prompt engineering required to achieve adequate system performance lacks transparency and thus repeatability. In contrast to LLM planning, symbolic planning methods offer strong reliability and repeatability guarantees, but struggle to scale to the complexity and ambiguity of real-world tasks. We introduce a new robotic planning method that augments LLM planners with symbolic planning oversight to improve reliability and repeatability, and provides a transparent approach to defining hard constraints with considerably greater clarity than traditional prompt engineering. Importantly, these augmentations preserve the reasoning capabilities of LLMs and retain impressive generalization in open-world environments. We demonstrate our approach in simulated and real-world environments. On the ALFWorld planning benchmark, our approach outperforms current state-of-the-art methods, achieving a near-perfect 99% success rate. Deployment of our method to a real-world quadruped robot resulted in 100% task success, compared to 50% and 30% for pure LLM and symbolic planners, respectively, across embodied pick-and-place tasks. Our approach presents an effective strategy to enhance the reliability, repeatability, and transparency of LLM-based robot planners while retaining their key strengths: flexibility and generalizability to complex real-world environments. We hope that this work will contribute to the broad goal of building resilient embodied intelligent systems.
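To make the idea of symbolic oversight concrete, the following is a minimal sketch of checking LLM-proposed actions against declared preconditions before they are executed. It is an illustration under stated assumptions, not the authors' implementation: the `Action` class, the helper functions, and the toy pick-and-place predicates are all hypothetical, and a real system would presumably derive preconditions and effects from a PDDL domain rather than from hand-written Python sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A STRIPS-style action (hypothetical stand-in for a PDDL operator)."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    del_effects: frozenset


def unmet_preconditions(state, action):
    """Return the preconditions that do not hold in `state` (empty if none)."""
    return action.preconditions - state


def apply_action(state, action):
    """Return the successor state after removing and adding effects."""
    return (state - action.del_effects) | action.add_effects


def execute_plan(state, proposed):
    """Apply proposed actions in order, stopping at the first invalid one.

    Returns (final_state, executed_names, rejection), where rejection is
    None on success or (action_name, missing_facts) on failure.
    """
    executed = []
    for action in proposed:
        missing = unmet_preconditions(state, action)
        if missing:
            return state, executed, (action.name, missing)
        state = apply_action(state, action)
        executed.append(action.name)
    return state, executed, None


# Toy domain (assumed for illustration only).
pick = Action(
    "pick(mug)",
    frozenset({"at(robot,table)", "hand_empty", "on(mug,table)"}),
    frozenset({"holding(mug)"}),
    frozenset({"hand_empty", "on(mug,table)"}),
)
goto_shelf = Action(
    "goto(shelf)",
    frozenset({"at(robot,table)"}),
    frozenset({"at(robot,shelf)"}),
    frozenset({"at(robot,table)"}),
)
place = Action(
    "place(mug,shelf)",
    frozenset({"at(robot,shelf)", "holding(mug)"}),
    frozenset({"on(mug,shelf)", "hand_empty"}),
    frozenset({"holding(mug)"}),
)

initial = frozenset({"at(robot,table)", "hand_empty", "on(mug,table)"})

# A valid proposal runs to completion.
good_state, good_executed, good_rejection = execute_plan(
    initial, [pick, goto_shelf, place]
)

# A proposal that skips the pick step is rejected at the place step.
bad_state, bad_executed, bad_rejection = execute_plan(
    initial, [goto_shelf, place]
)
```

In this sketch, the rejection carries the specific unmet facts, which is the kind of explicit feedback that could be handed back to an LLM planner for replanning instead of letting an invalid action reach the robot.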
