GSCE: A Prompt Framework with Enhanced Reasoning for Reliable LLM-driven Drone Control

Wenhao Wang, Yanyan Li, Long Jiao, Jiawei Yuan

TL;DR

This work tackles the reliability gap in LLM-driven drone control by introducing GSCE, a four-component prompt framework consisting of Guidelines, Skill APIs, Constraints, and Examples to enhance reasoning and constraint compliance. By embedding natural-language (NL) constraints and example-based chain-of-thought (CoT) reasoning within prompts, GSCE guides the LLM to generate executable drone control code that adheres to safety and operational constraints. Empirical evaluation in a realistic AirSim environment across 44 complex tasks demonstrates that GSCE substantially improves both task success rates and the correctness of intermediate state transitions, outperforming baselines that use either constraints or examples alone. The results suggest that integrating NL constraints with in-context learning offers a practical pathway to reliable LLM-powered autonomous drone systems with broad applicability to other robotic platforms.

Abstract

The integration of Large Language Models (LLMs) into robotic control, including drones, has the potential to revolutionize autonomous systems. Research studies have demonstrated that LLMs can be leveraged to support robotic operations. However, for tasks that require complex reasoning, the reliability of LLM-produced solutions remains a concern. In this paper, we propose a prompt framework with enhanced reasoning to enable reliable LLM-driven control for drones. Our framework, named GSCE, consists of novel technical components built from Guidelines, Skill APIs, Constraints, and Examples. GSCE features reliable and constraint-compliant code generation. We performed thorough experiments using GSCE for drone control across a wide range of task complexities. Our experimental results demonstrate that GSCE can significantly improve task success rates and completeness compared to baseline approaches, highlighting its potential for reliable LLM-driven autonomous drone systems.

Paper Structure

This paper contains 30 sections, 3 figures, 4 tables.

Figures (3)

  • Figure 1: Illustration of prompt structures for LLM-driven drone control systems. Structure (a) prompts LLMs with guidelines and skill APIs, as demonstrated in ChatGPT for Robotics and Code as Policies. Our proposed GSCE structure (b) introduces constraints to regulate LLM behavior and incorporates examples that show the LLM how the constraints are implemented, thereby enhancing the reasoning capabilities of LLMs.
  • Figure 2: Our method starts by establishing the GSCE framework with guidelines, skill APIs, constraints, and examples. The user then provides queries (task descriptions) to the LLM for code generation. The generated code is extracted from the LLM's output and executed to control the drone. The generated code contains a sequence of actions that reflects step-by-step reasoning and implements the constraints correctly.
  • Figure 3: Success Rate and Completeness across varying numbers of examples in the GSCE framework. Both GPT-4-Turbo and GPT-4o exhibit performance improvements as the number of examples increases. However, the improvement becomes marginal when the number of examples exceeds three.
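
The pipeline described in the Figure 2 caption (assemble a prompt from the four components, then extract the generated code from the LLM's reply) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `build_gsce_prompt` and `extract_code`, the prompt layout, and the `drone.*` skill names are all hypothetical, and the LLM call and the execution of the extracted code are deliberately left out.

```python
import re

# Built programmatically so this example contains no literal code fence.
FENCE = "`" * 3


def build_gsce_prompt(guidelines, skill_apis, constraints, examples):
    """Join the four GSCE components into one prompt string.

    The section titles and bullet layout are assumptions made for
    illustration; the paper's actual prompt wording is not reproduced here.
    """
    sections = [
        ("Guidelines", guidelines),
        ("Skill APIs", skill_apis),
        ("Constraints", constraints),
        ("Examples", examples),
    ]
    parts = []
    for title, items in sections:
        body = "\n".join(f"- {item}" for item in items)
        parts.append(f"{title}:\n{body}")
    return "\n\n".join(parts)


def extract_code(llm_output):
    """Return the first fenced code block in the LLM reply, or None."""
    pattern = FENCE + r"(?:python)?\s*\n(.*?)" + FENCE
    match = re.search(pattern, llm_output, re.DOTALL)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    prompt = build_gsce_prompt(
        guidelines=["Respond with Python code only."],
        skill_apis=["drone.takeoff()", "drone.fly_to(x, y, z)"],  # hypothetical
        constraints=["Never fly below 1 m."],                      # hypothetical
        examples=["Query: take off. Code: drone.takeoff()"],
    )
    print(prompt)

    # A hand-written stand-in for an LLM reply.
    reply = "Plan: take off first.\n" + FENCE + "python\ndrone.takeoff()\n" + FENCE
    print(extract_code(reply))
```

In a real system the extracted code would be validated and run in a sandboxed environment against the simulator's skill APIs rather than executed directly.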