SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
Yi Wu, Zikang Xiong, Yiran Hu, Shreyash S. Iyengar, Nan Jiang, Aniket Bera, Lin Tan, Suresh Jagannathan
TL;DR
SELP tackles the challenge of safe and efficient long-horizon robotic planning under complex natural-language commands by fusing three techniques: equivalence voting to robustly translate NL to LTL specifications, LTL-enforced constrained decoding to prune unsafe plan actions via a Büchi automaton, and domain-specific fine-tuning to bias planners toward efficient, safe plans. The approach yields two new datasets, DroneNav and TabletopManip, and demonstrates higher safety rates and plan efficiency than state-of-the-art LLM planners across drone navigation and tabletop manipulation tasks, with notable improvements in translation accuracy thanks to voting. Key contributions include a robust NL-to-LTL translation method, a practical constrained decoding mechanism that enforces temporal logic during inference, and a fine-tuning strategy that aligns planning with safety and efficiency goals. The work’s results suggest SELP’s techniques generalize across domains and offer a tangible path toward reliable NL-driven robotic planning in real-world settings; future work will explore energy efficiency and multi-modal perception integration.
Abstract
Despite significant advancements in large language models (LLMs) that enhance robot agents' understanding and execution of natural language (NL) commands, ensuring that the agents adhere to user-specified constraints remains challenging, particularly for complex commands and long-horizon tasks. To address this challenge, we present three key insights (equivalence voting, constrained decoding, and domain-specific fine-tuning) that significantly enhance LLM planners' capability in handling complex tasks. Equivalence voting ensures consistency by sampling multiple Linear Temporal Logic (LTL) formulas generated from an NL command, grouping equivalent LTL formulas, and selecting a formula from the majority group as the final LTL formula. Constrained decoding then uses the generated LTL formula to constrain the autoregressive inference of plans, so that the generated plans conform to the LTL formula. Domain-specific fine-tuning customizes LLMs to produce safe and efficient plans within specific task domains. Our approach, Safe Efficient LLM Planner (SELP), combines these insights to create LLM planners that generate plans adhering to user commands with high confidence. We demonstrate the effectiveness and generalizability of SELP across different robot agents and tasks, including drone navigation and robot manipulation. For drone navigation tasks, SELP outperforms state-of-the-art planners by 10.8% in safety rate (i.e., finishing tasks conforming to NL commands) and by 19.8% in plan efficiency. For robot manipulation tasks, SELP achieves a 20.4% improvement in safety rate. Our datasets for evaluating NL-to-LTL and robot task planning will be released at github.com/lt-asset/selp.
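
To make the two inference-time ideas concrete, the following is a minimal Python sketch, not the authors' implementation. Everything named here is hypothetical and chosen for illustration: `equivalence_vote`, `constrained_greedy_decode`, and the `toy_*` helpers. In particular, `toy_equivalent` is a crude string-level stand-in for a real LTL equivalence check (which would normally use an automata-based tool), and `toy_violates` is a hand-written prefix monitor for the single constraint "do not enter room_b before room_a", standing in for a monitor derived from the voted LTL formula.

```python
from typing import Callable, Dict, List, Sequence


def equivalence_vote(candidates: Sequence[str],
                     equivalent: Callable[[str, str], bool]) -> str:
    """Group candidates into equivalence classes and return a representative
    of the largest class (ties go to the class that appeared first)."""
    if not candidates:
        raise ValueError("need at least one candidate formula")
    groups: List[List[str]] = []
    for formula in candidates:
        for group in groups:
            if equivalent(group[0], formula):
                group.append(formula)
                break
        else:
            groups.append([formula])
    # max() returns the first maximal element, so earlier groups win ties.
    return max(groups, key=len)[0]


def toy_equivalent(a: str, b: str) -> bool:
    """ASSUMPTION: toy check only. Treats formulas as equal if they have the
    same set of '&'-separated conjuncts, ignoring whitespace and order."""
    def conjuncts(f: str) -> frozenset:
        return frozenset(f.replace(" ", "").split("&"))
    return conjuncts(a) == conjuncts(b)


def constrained_greedy_decode(propose: Callable[[List[str]], Dict[str, float]],
                              violates: Callable[[List[str]], bool],
                              max_steps: int,
                              stop: str = "<end>") -> List[str]:
    """Greedy decoding that discards any next action whose extended prefix
    the monitor rejects, then picks the best-scoring remaining action."""
    plan: List[str] = []
    for _ in range(max_steps):
        scores = propose(plan)
        safe = {a: s for a, s in scores.items() if not violates(plan + [a])}
        if not safe:
            break  # no admissible continuation; return the plan so far
        action = max(safe, key=safe.get)
        if action == stop:
            break
        plan.append(action)
    return plan


def toy_violates(prefix: List[str]) -> bool:
    """ASSUMPTION: hand-coded monitor for 'no room_b before room_a'."""
    for action in prefix:
        if action == "room_a":
            return False
        if action == "room_b":
            return True
    return False


def toy_propose(prefix: List[str]) -> Dict[str, float]:
    """ASSUMPTION: stand-in for an LLM's next-action scores."""
    if "room_b" in prefix:
        return {"<end>": 1.0}
    return {"room_b": 0.6, "room_a": 0.3, "<end>": 0.1}


if __name__ == "__main__":
    samples = ["G(!a) & F(b)", "F(b) & G(!a)", "F(b)", "F(b)&G(!a)"]
    print(equivalence_vote(samples, toy_equivalent))
    print(constrained_greedy_decode(toy_propose, toy_violates, max_steps=5))
```

In this toy run the unconstrained scorer prefers `room_b` first, but the monitor removes that action until `room_a` has been visited. A fuller treatment would also track liveness obligations (for example, refusing to stop before a required goal is reached), which this prefix-only check does not attempt.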
