UAVGENT: A Language-Guided Distributed Control Framework

Ziyi Zhang; Xiyu Deng; Guannan Qu; Yorie Nakahira

TL;DR

This work introduces UAVgent, a language-guided distributed control framework for multi-drone operations that couples a human-in-the-loop LLM supervisor with a robust inner-loop controller operating on a radius-based communication graph. The outer layer translates natural-language goals into drone references, the middle layer automatically verifies and corrects commands at periodic checkpoints, and the inner layer preserves stability by driving edge-formation errors toward zero, up to a residual set by the disturbance bound. An exponential input-to-state stability (ISS) result ties LLM grounding accuracy, supervision cadence, and graph connectivity to a provable tracking-error bound that holds across dynamic mission updates. The approach is demonstrated in police-chase and forest search-and-rescue simulations, which show dynamic reformation, target reassignment, and adaptive behavior with minimal human intervention. UAVgent thus integrates high-level language reasoning with distributed control guarantees for interpretable swarm coordination in complex, evolving missions.

Abstract

We study language-in-the-loop control for multi-drone systems that execute evolving, high-level missions while retaining formal robustness guarantees at the physical layer. We propose a three-layer architecture in which (i) a human operator issues natural-language instructions, (ii) an LLM-based supervisor periodically interprets, verifies, and corrects the commanded task in the context of the latest state and target estimates, and (iii) a distributed inner-loop controller tracks the resulting reference using only local relative information. We derive a theoretical guarantee that characterizes tracking performance under bounded disturbances and piecewise-smooth references with discrete jumps induced by LLM updates. Overall, our results illustrate how centralized language-based task reasoning can be combined with distributed feedback control to achieve complex behaviors with provable robustness and stability.
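To make the "local relative information" idea concrete, the following is a minimal sketch of a radius-based interaction graph: two drones are neighbors when they lie within a communication radius of each other. The function names `radius_graph` and `connected_components`, the 2D positions, and the radius value are illustrative assumptions, not the paper's implementation.

```python
import math
from collections import deque


def radius_graph(positions, radius):
    """Build adjacency lists: i and j are neighbors if their distance is <= radius.

    Hypothetical helper for illustration; the paper's exact graph definition
    (for example, strict vs. non-strict inequality) is not given in this summary.
    """
    n = len(positions)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= radius:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def connected_components(neighbors):
    """Return the connected components of the graph via breadth-first search."""
    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


# Three drones in a chain plus one drone far away (assumed example data).
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
graph = radius_graph(positions, radius=1.5)
components = connected_components(graph)
print(graph)
print(components)
```

In this example the far-away drone forms its own component, which is the situation where a per-component analysis would apply rather than a whole-swarm one.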

Paper Structure (47 sections, 3 theorems, 77 equations, 8 figures)

Introduction
Contribution
Related Works
Classical Multi-agent Control.
LLM-guided Swarm Control.
Algorithm Design
Notation.
Radius-induced interaction graph.
Outer layer: user instruction via LLM
Middle layer: LLM supervision
Reference interpolation.
Tracking re-grounding for moving targets.
Range-limited target search.
Inner layer: distributed formation tracking on a radius-induced graph
Preliminaries
...and 32 more sections

Key Result

Theorem 1

Fix $k$ and consider an interval $[t_k,t_{k+1})$ on which Assumption \ref{assumption:general} holds. Further assume the corresponding graph is connected (or restrict to a connected component). Let $e(t)$ satisfy the closed-loop edge-error dynamics \eqref{eq:error_dyn} with additive input $w(t)$ given by \eqref{eq:w_def}. Let $\lambda$ […]. Then, for any $t\in[t_k,t_{k+1})$, […]. In particular, if $\|w(t)\|\le \bar{w}$ on $[t_k,t_{k+1})$, then […].
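For intuition about what an exponential ISS statement asserts, here is a small numerical sketch on a textbook scalar system $\dot e = -\lambda e + w$ with $|w|\le \bar{w}$, for which $|e(t)| \le e^{-\lambda t}|e(0)| + \bar{w}/\lambda$. This is a generic illustration only: the scalar model, the sinusoidal disturbance, the forward-Euler step, and the function name `simulate_scalar_iss` are assumptions, and the constants are not those of Theorem 1.

```python
import math


def simulate_scalar_iss(e0, lam, w_bar, dt=0.001, t_end=5.0):
    """Forward-Euler simulation of de/dt = -lam * e + w(t) with |w(t)| <= w_bar.

    Returns the final error and the smallest margin (bound - |e|) seen along
    the trajectory, where bound = exp(-lam * t) * |e0| + w_bar / lam.
    """
    steps = int(round(t_end / dt))
    e = e0
    worst_margin = float("inf")
    for k in range(steps + 1):
        t = k * dt
        bound = math.exp(-lam * t) * abs(e0) + w_bar / lam
        worst_margin = min(worst_margin, bound - abs(e))
        # Assumed disturbance: any signal with magnitude at most w_bar works.
        w = w_bar * math.sin(3.0 * t)
        e = e + dt * (-lam * e + w)
    return e, worst_margin


final_error, worst_margin = simulate_scalar_iss(e0=2.0, lam=1.5, w_bar=0.3)
print(f"final |e| = {abs(final_error):.4f}, worst margin = {worst_margin:.4f}")
```

A non-negative worst margin means the simulated error stayed under the bound at every step: the initial error decays exponentially, and what remains is capped by the disturbance level divided by the decay rate.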

Figures (8)

Figure 1: UAVgent hierarchical structure
Figure 2: In \ref{fig:case1a}, 24 drones are initially scattered around the target vehicles. The user commands the drones to form a grid and track the vehicles (\ref{fig:case1b}). As the vehicles diverge, the inner-layer controller splits the swarm to track each target, temporarily degrading the formation (\ref{fig:case1c}). The LLM supervisor then recenters and rebalances the sub-swarms (\ref{fig:case1g}). Finally, the user requests reformation into a circle, square, and cross (\ref{fig:case1j}). Additional details are shown in \ref{fig:app_figure}.
Figure 3: In \ref{fig:case2a}, we show a mountain search-and-rescue scenario with a rescue vehicle (red circle) and a missing person (green circle). After the user commands the drones to explore, they move to assigned waypoints while searching the region (\ref{fig:case2b}). Upon detection, the drones form an encirclement around the target (\ref{fig:case2c}).
Figure 4: In \ref{fig:app_case1_1a}, 24 drones are scattered around the target cars. The user then commands the drones, via the LLM, to form a grid around the cars, as shown in \ref{fig:app_case1_1b}. As the cars move toward the road intersection, they diverge in different directions. The inner-layer control algorithm diverts groups of drones to track different cars, as shown in \ref{fig:app_case1_1c}. However, the low-level control heuristic does not ensure that the drones maintain the formation exactly during splitting. Lastly, the LLM auto-verification recenters each drone on its target and rebalances the group for each target, as shown in \ref{fig:app_case1_1g}, \ref{fig:app_case1_1h}, and \ref{fig:app_case1_1i}. The user then issues additional commands to reform the groups into a circle, square, and cross, as shown in \ref{fig:app_case1_1j}, \ref{fig:app_case1_1k}, and \ref{fig:app_case1_1l}.
Figure 5: In \ref{fig:app_case1_2a}, 24 drones are scattered around the target cars. The user then commands the drones, via the LLM, to form a circle around the cars, as shown in \ref{fig:app_case1_2b}. As the cars move toward the road intersection, they diverge in different directions. The inner-layer control algorithm diverts groups of drones to track different cars, as shown in \ref{fig:app_case1_2c}. However, the low-level control heuristic does not ensure that the drones maintain the formation exactly during splitting. Lastly, the LLM auto-verification recenters each drone on its target and rebalances the group for each target, as shown in \ref{fig:app_case1_2d}, \ref{fig:app_case1_2e}, and \ref{fig:app_case1_2f}. The user then issues additional commands to stop tracking, as shown in \ref{fig:app_case1_2j}, \ref{fig:app_case1_2k}, and \ref{fig:app_case1_2l}.
...and 3 more figures

Theorems & Definitions (7)

Theorem 1: Exponential ISS of edge-formation error
Corollary 1: LLM relative accuracy for edge-level tolerance
Remark 1: Feasibility and interpretation
proof: Proof of \ref{thm:ISS}
proof
Theorem 2: Horizon-wide relative tracking bound under stochastic LLM verification
proof
