Agentic UAVs: LLM-Driven Autonomy with Integrated Tool-Calling and Cognitive Reasoning
Anis Koubaa, Khaled Gabr
TL;DR
The paper addresses the gap between rule-based UAV control and general-purpose, context-aware autonomy by introducing Agentic UAVs, a five-layer architecture that combines LLM-driven reasoning with continuous perception, tool-calling, and ecosystem integration. It also demonstrates swarm collaboration via standardized protocols and evaluates the approach in high-fidelity search-and-rescue (SAR) simulations. The evaluation shows improved detection, contextual understanding, and autonomous decision-making, at the cost of higher processing overhead that hybrid local–cloud configurations can mitigate. The results suggest that UAVs can function as ecosystem-aware cognitive agents capable of distributed problem solving rather than as isolated planners. The work thus points toward general-purpose aerial agents with real-time knowledge access and multi-agent collaboration that link perception, reasoning, and action within an integrated digital ecosystem.
Abstract
Unmanned Aerial Vehicles (UAVs) are increasingly used in defense, surveillance, and disaster response, yet most systems still operate at SAE Level 2 to 3 autonomy. Their dependence on rule-based control and narrow AI limits adaptability in dynamic and uncertain missions. Current UAV architectures lack context-aware reasoning, autonomous decision-making, and integration with external systems. Importantly, none uses Large Language Model (LLM) agents with tool-calling for real-time knowledge access. This paper introduces the Agentic UAVs framework, a five-layer architecture consisting of Perception, Reasoning, Action, Integration, and Learning. The framework enhances UAV autonomy through LLM-driven reasoning, database querying, and interaction with third-party systems. A prototype built with ROS 2 and Gazebo combines YOLOv11 for object detection with GPT-4 for reasoning, alongside a locally deployed Gemma 3 model. In simulated search-and-rescue scenarios, agentic UAVs achieved higher detection confidence (0.79 compared to 0.72), improved person detection rates (91% compared to 75%), and a major increase in correct action recommendations (92% compared to 4.5%). These results show that modest computational overhead can enable significantly higher levels of autonomy and system-level integration.
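To make the five-layer decomposition concrete, the following is a minimal Python sketch of how one mission step might pass through Perception, Reasoning, Action, Integration, and Learning. It is an illustration under stated assumptions, not the authors' implementation: every name in it (`Detection`, `perceive`, `reason`, `act`, `TOOLS`, `MissionLog`, `run_step`) is hypothetical, the detector is replaced by a confidence filter, and the LLM is replaced by a simple rule so that the example runs without ROS 2, Gazebo, or any model API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Detection:
    label: str
    confidence: float


def perceive(frame: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Perception: keep detections at or above a confidence threshold.

    Stand-in for an object detector; the threshold value is an assumption.
    """
    return [d for d in frame if d.confidence >= threshold]


# Integration: hypothetical external tools the reasoning layer may call.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup_weather": lambda area: f"weather for {area}: clear",
    "notify_rescue_team": lambda area: f"rescue team notified for {area}",
}


def reason(detections: List[Detection], area: str) -> Dict[str, str]:
    """Reasoning: rule-based stand-in for an LLM choosing an action and a tool."""
    if any(d.label == "person" for d in detections):
        return {"action": "hover_and_report", "tool": "notify_rescue_team", "arg": area}
    return {"action": "continue_search", "tool": "lookup_weather", "arg": area}


def act(plan: Dict[str, str]) -> str:
    """Action: execute the plan by invoking the selected tool."""
    tool_output = TOOLS[plan["tool"]](plan["arg"])
    return f"{plan['action']} | {tool_output}"


@dataclass
class MissionLog:
    """Learning: records outcomes for later analysis (no model update here)."""
    entries: List[str] = field(default_factory=list)

    def record(self, outcome: str) -> None:
        self.entries.append(outcome)


def run_step(frame: List[Detection], area: str, log: MissionLog) -> str:
    """Run one perceive -> reason -> act -> record cycle."""
    detections = perceive(frame)
    plan = reason(detections, area)
    outcome = act(plan)
    log.record(outcome)
    return outcome
```

In this sketch, a frame containing a person detected with confidence 0.79 leads `run_step` to select the `notify_rescue_team` tool, while a frame with no detection above the threshold falls back to `continue_search`. In a real system the `reason` function would be replaced by an LLM call that returns a structured tool request.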
