Integrating Intent Understanding and Optimal Behavior Planning for Behavior Tree Generation from Human Instructions
Xinglin Chen, Yishuai Cai, Yunxin Mao, Minglong Li, Wenjing Yang, Weixia Xu, Ji Wang
TL;DR
This work tackles translating human natural-language instructions into reliable robot behaviors by coupling intent understanding with formal planning. It introduces a two-stage framework: Stage 1 uses large language models to translate instructions into a goal expressed as well-formed formulas in first-order logic, and Stage 2 applies the Optimal Behavior Tree Expansion Algorithm (OBTEA) to construct a finite-time, cost-minimizing BT that guarantees goal achievement. The FO-formalization (O, P_c, P_a) and the use of Disjunctive Normal Form enable precise goal decomposition and modular BT generation, while reflective feedback and prompt engineering enhance LLM reliability. Empirical validation in a café-like service scenario demonstrates that OBTEA produces lower-cost, more efficient BTs than baselines, and deployment in a digital twin environment shows practical applicability for embodied intelligent agents. The framework advances interpretable, reliable, and adaptable robot planning by integrating language grounding with symbolically grounded behavior synthesis, with potential impact on domestic and industrial robotics.
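To make the Disjunctive Normal Form step concrete, the following is a minimal sketch of how a goal formula over ground atoms could be split into alternative conjunctive sub-goals. It is an illustration only and not the authors' implementation: the tuple-based formula encoding, the function name `to_dnf`, and the example atoms (`On(Coffee,Table1)`, `Active(AC)`, `Closed(Curtain)`) are all assumptions introduced here, and negation is assumed to apply only to atoms.

```python
from itertools import product

# Hypothetical encoding (an assumption of this sketch, not the paper's format):
#   atom           -> "On(Coffee,Table1)"
#   negated atom   -> ("not", "Active(AC)")
#   conjunction    -> ("and", f1, f2, ...)
#   disjunction    -> ("or", f1, f2, ...)

def to_dnf(formula):
    """Return the DNF of `formula` as a set of frozensets of literals.

    Each frozenset is one conjunction; the outer set is their disjunction.
    Negated atoms are written with a leading "~".
    """
    if isinstance(formula, str):
        return {frozenset([formula])}
    op, *args = formula
    if op == "not":
        (atom,) = args
        if not isinstance(atom, str):
            raise ValueError("negation is only supported on atoms in this sketch")
        return {frozenset(["~" + atom])}
    if op == "or":
        # A disjunction's DNF is the union of its operands' DNFs.
        return set().union(*(to_dnf(a) for a in args))
    if op == "and":
        # Distribute AND over OR: pick one conjunct from each operand.
        result = set()
        for combo in product(*(to_dnf(a) for a in args)):
            conj = frozenset().union(*combo)
            # Drop contradictory conjunctions such as {A, ~A}.
            if not any("~" + lit in conj for lit in conj):
                result.add(conj)
        return result
    raise ValueError(f"unknown operator: {op!r}")

# Example goal: coffee on the table AND (AC active OR curtain closed).
goal = ("and", "On(Coffee,Table1)", ("or", "Active(AC)", "Closed(Curtain)"))
dnf = to_dnf(goal)
for conjunct in sorted(sorted(c) for c in dnf):
    print(" & ".join(conjunct))
```

Each resulting conjunction is an alternative way to satisfy the goal, so a planner can treat them as separate sub-goals and choose among them, for example by cost.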
Abstract
Robots that execute tasks from human instructions in domestic or industrial environments require both adaptability and reliability. The Behavior Tree (BT) is an appropriate control architecture for these scenarios because of its modularity and reactivity. Existing BT generation methods, however, either do not interpret natural language or cannot theoretically guarantee that the generated BTs succeed. This paper proposes a two-stage framework for BT generation that first employs large language models (LLMs) to interpret goals from high-level instructions and then constructs an efficient goal-specific BT through the Optimal Behavior Tree Expansion Algorithm (OBTEA). We represent goals as well-formed formulas in first-order logic, which bridges intent understanding and optimal behavior planning. Experiments in a service robot scenario validate the proficiency of LLMs in producing grammatically correct and accurately interpreted goals, demonstrate OBTEA's superiority over the baseline BT Expansion algorithm on various metrics, and confirm the practical deployability of our framework. The project website is https://dids-ei.github.io/Project/LLM-OBTEA/.
