Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations

Wenhao Wang, Yanyan Li, Long Jiao, Jiawei Yuan

TL;DR

This work addresses the reliability gap in LLM-driven UAV operation within IoT environments by introducing a closed-loop framework that pairs two specialized LLMs, a Code Generator and an Evaluator, with simulation-based refinement and semantic trajectory observations. The Code Generator proposes UAV operation code, which is executed in simulation; the resulting numerical state observations are transformed into semantic trajectory descriptions so that the Evaluator can better understand the flight and give more precise feedback. The Evaluator, guided by structured prompts and references, supplies corrective feedback that iteratively refines the code until the task objectives are met, which avoids the risk of running incorrect code on a physical UAV during refinement. The reported results are a 100% success rate (SR) and 100% completeness on Basic tasks, and an 85% SR with 98.5% completeness on Advanced tasks, outperforming several baselines and indicating the approach’s potential for reliable LLM-driven UAV control. The work also highlights the importance of semantic reasoning in evaluation and offers insights into evaluator configuration, iteration tradeoffs, and cross-domain applicability.

Abstract

Recent advances in large language models (LLMs) have revolutionized mobile robots, including unmanned aerial vehicles (UAVs), enabling their intelligent operation within Internet of Things (IoT) ecosystems. However, LLMs still face challenges in logical reasoning and complex decision-making, leading to concerns about the reliability of LLM-driven UAV operations in IoT applications. In this paper, we propose a closed-loop LLM-driven UAV operation code generation framework that enables reliable UAV operations powered by effective feedback and refinement using two LLM modules, i.e., a Code Generator and an Evaluator. Our framework transforms numerical state observations from UAV operations into semantic trajectory descriptions to enhance the evaluator LLM's understanding of UAV dynamics for precise feedback generation. Our framework also enables a simulation-based refinement process, and hence eliminates the risks to physical UAVs caused by incorrect code execution during refinement. We conduct extensive experiments on UAV control tasks of varying complexity. The experimental results show that our framework achieves reliable UAV operations using LLMs and significantly outperforms baseline methods in success rate and completeness as task complexity increases.
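The closed loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `refine_in_simulation`, the `Verdict` type, the four callables, and the `max_iters` limit are all hypothetical names introduced here, and the LLM calls and the simulator are left as caller-supplied functions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Assumed state layout: (north, east, down, yaw).
State = Tuple[float, float, float, float]


@dataclass
class Verdict:
    """Hypothetical evaluator result: pass/fail plus textual feedback."""
    passed: bool
    feedback: str = ""


def refine_in_simulation(
    task: str,
    generate: Callable[[str, Optional[str]], str],  # Code Generator LLM (stub)
    simulate: Callable[[str], List[State]],         # runs code in a simulator
    describe: Callable[[List[State]], str],         # numeric -> semantic text
    evaluate: Callable[[str, str], Verdict],        # Evaluator LLM (stub)
    max_iters: int = 5,                             # assumed iteration budget
) -> Tuple[Optional[str], int]:
    """Return (accepted code or None, number of iterations used)."""
    feedback: Optional[str] = None
    for iteration in range(1, max_iters + 1):
        code = generate(task, feedback)      # propose or refine the code
        states = simulate(code)              # never touches a physical UAV
        semantic = describe(states)          # semantic trajectory observation
        verdict = evaluate(task, semantic)   # compare against the task
        if verdict.passed:
            return code, iteration           # ready for deployment
        feedback = verdict.feedback          # feed the error back
    return None, max_iters
```

Only code that the evaluator accepts is returned, so in this sketch deployment to real hardware would happen strictly after the simulated loop ends.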

Paper Structure

This paper contains 23 sections, 4 equations, 10 figures, 6 tables, 2 algorithms.

Figures (10)

  • Figure 1: Pipelines of robotic operation code generation: (a) without LLMs and (b) with LLMs.
  • Figure 2: Illustration of LLM-driven closed-loop UAV Operation and Refinement.
  • Figure 3: Overall design of LLM-driven closed-loop UAV operation code generation with semantic observations. An example task of examining a square area is presented. In the first loop, the evaluator identifies deviations in the semantic trajectory observation (action 12) and sends error feedback to the code generator to refine the code. In the second loop, the evaluator confirms that the refined code matches the task, and the code is deployed on the UAV for mission execution.
  • Figure 4: An example of transforming numerical state observations into a semantic trajectory observation. The UAV state vector $[x, y, z, \theta]$ denotes (North, East, Down, Yaw).
  • Figure 5: The evaluation process begins by configuring the LLM agent (evaluator) via a system prompt framework that includes roles, rules, and references. Next, the semantic observations and the task description are provided to the evaluator for comparison.
  • ...and 5 more figures
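
To make the transformation named in the Figure 4 caption concrete, here is one possible sketch of turning consecutive $[x, y, z, \theta]$ states into text. The wording of the descriptions, the tolerances, the numbering of actions, and the assumption that yaw is in degrees are all illustrative choices made here, not details taken from the paper. The sketch relies on the standard North-East-Down convention, in which a decreasing Down value means the UAV is climbing and a positive yaw change is clockwise when viewed from above.

```python
from typing import List, Sequence, Tuple

# (north_m, east_m, down_m, yaw_deg); yaw in degrees is an assumption.
State = Tuple[float, float, float, float]


def describe_step(prev: State, curr: State,
                  pos_tol: float = 0.1, yaw_tol: float = 1.0) -> List[str]:
    """Describe the change between two states as short phrases."""
    dn, de, dd = (curr[i] - prev[i] for i in range(3))
    # Wrap the yaw difference into [-180, 180) so 350 -> 10 reads as +20.
    dyaw = (curr[3] - prev[3] + 180.0) % 360.0 - 180.0
    parts: List[str] = []
    if abs(dn) > pos_tol:
        parts.append(f"move {'north' if dn > 0 else 'south'} {abs(dn):.1f} m")
    if abs(de) > pos_tol:
        parts.append(f"move {'east' if de > 0 else 'west'} {abs(de):.1f} m")
    if abs(dd) > pos_tol:
        # Down axis: a negative change means the UAV gained altitude.
        parts.append(f"{'descend' if dd > 0 else 'ascend'} {abs(dd):.1f} m")
    if abs(dyaw) > yaw_tol:
        direction = "clockwise" if dyaw > 0 else "counterclockwise"
        parts.append(f"turn {direction} {abs(dyaw):.0f} deg")
    return parts


def describe_trajectory(states: Sequence[State]) -> List[str]:
    """Produce one numbered line per transition between sampled states."""
    lines: List[str] = []
    for k in range(1, len(states)):
        parts = describe_step(states[k - 1], states[k])
        lines.append(f"action {k}: " + (", ".join(parts) if parts else "hover"))
    return lines


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, -2.0, 0.0), (5.0, 0.0, -2.0, 0.0),
            (5.0, 0.0, -2.0, 90.0), (5.0, 5.0, -2.0, 90.0)]
    print("\n".join(describe_trajectory(demo)))
```

A text trajectory of this kind could be passed to an evaluator LLM alongside the task description, which is the role the figure captions assign to the semantic observation.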