Automatic Robot Task Planning by Integrating Large Language Model with Genetic Programming
Azizjon Kobilov, Jianglin Lan
TL;DR
The paper tackles automated robot task planning by integrating a multimodal Large Language Model (LLM) with Genetic Programming (GP) to generate and refine Behavior Tree (BT) policies from natural-language task descriptions and images of the environment. The LLM supplies an initial, diverse population of BTs, which GP then evolves toward high fitness; a validation step curbs hallucinations and keeps the trees relevant to the context. Key contributions include environment-aware BT generation without predefined examples, fitness-guided filtering to accelerate evolution, and robustness under uncertainty, evaluated in experiments across multiple scenarios and with reduced initial populations. The results show faster convergence to high-fitness BTs and stable performance under varying conditions, highlighting the approach's practical potential for scalable, low-human-input task planning in autonomous systems.
Abstract
Accurate task planning is critical for controlling autonomous systems such as robots, drones, and self-driving vehicles. Behavior Trees (BTs) are among the most prominent frameworks for defining control policies in task planning, owing to their modularity, flexibility, and reusability. However, generating reliable and accurate BT-based control policies for robotic systems remains challenging and often requires domain expertise. In this paper, we present the LLM-GP-BT technique, which leverages a Large Language Model (LLM) and Genetic Programming (GP) to automate the generation and configuration of BTs. The technique takes robot task commands expressed in natural language and converts them into accurate and reliable BT-based task plans in a computationally efficient and user-friendly manner. It is systematically developed and validated through simulation experiments, demonstrating its potential to streamline task planning for autonomous systems.
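The pipeline summarized above (an LLM-seeded population, a validation filter, then GP evolution) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the LLM is replaced by a hypothetical random stub (llm_propose_trees), each BT is simplified to a flat sequence of actions under a single Sequence node, and the action names, the "teleport" hallucination, the fitness function, and the GP operators are all assumptions chosen for illustration.

```python
import random

# Hypothetical action vocabulary and target plan for a pick-and-place task
# (assumptions for illustration, not taken from the paper).
ACTIONS = ["move_to_object", "grasp", "move_to_goal", "release"]
TARGET = ["move_to_object", "grasp", "move_to_goal", "release"]


def llm_propose_trees(n, rng):
    """Stand-in for the LLM: return n candidate BTs as flat action sequences.

    Some candidates may contain "teleport", an action outside ACTIONS,
    to mimic a hallucinated node.
    """
    vocab = ACTIONS + ["teleport"]
    return [[rng.choice(vocab) for _ in range(len(TARGET))] for _ in range(n)]


def is_valid(tree):
    """Validation step: reject trees that use unknown actions."""
    return all(action in ACTIONS for action in tree)


def fitness(tree):
    """Toy fitness: number of positions that match the target plan."""
    return sum(a == b for a, b in zip(tree, TARGET))


def evolve(population, generations, rng, mutation_rate=0.2):
    """Minimal GP loop: elitism, tournament selection, crossover, mutation."""
    if len(population) < 2:
        raise ValueError("need at least two valid trees to evolve")
    pop = [list(tree) for tree in population]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = [pop[0]]  # elitism: the best tree always survives
        while len(next_pop) < len(pop):
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover builds a new list, so parents stay intact.
            cut = rng.randrange(1, len(TARGET))
            child = p1[:cut] + p2[cut:]
            # Point mutation draws only from valid actions.
            if rng.random() < mutation_rate:
                child[rng.randrange(len(child))] = rng.choice(ACTIONS)
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = llm_propose_trees(30, rng)
    seeds = [tree for tree in candidates if is_valid(tree)]  # filter hallucinations
    best = evolve(seeds, generations=20, rng=rng)
    print(len(candidates), len(seeds), best, fitness(best))
```

Because the best tree is carried over unchanged in every generation, the best fitness in this sketch can never decrease, and because crossover and mutation draw only from validated trees and known actions, every offspring remains valid. A real system would evolve full tree structures with control-flow nodes and would score each BT by executing it in simulation.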
