UAVgent: A Language-Guided Distributed Control Framework
Ziyi Zhang, Xiyu Deng, Guannan Qu, Yorie Nakahira
TL;DR
This work introduces UAVgent, a language-guided distributed control framework for multi-drone operations that couples a human-in-the-loop LLM supervisor with a robust inner-loop controller operating on a radius-based communication graph. The outer layer translates natural-language goals into drone references, the middle layer automatically verifies and corrects commands at periodic checkpoints, and the inner layer preserves stability by driving edge-formation errors to zero under bounded disturbances. A formal exponential input-to-state stability result ties the LLM grounding accuracy, supervision cadence, and graph connectivity to a provable tracking-error bound that continues to hold across dynamic mission updates. The approach is demonstrated in police-chasing and forest search-and-rescue simulations, which show dynamic reformation, target reassignment, and adaptive behavior with minimal human intervention. Overall, UAVgent integrates high-level language reasoning with distributed control guarantees to give robust, interpretable coordination of drone swarms on complex, evolving missions.
Abstract
We study language-in-the-loop control for multi-drone systems that execute evolving, high-level missions while retaining formal robustness guarantees at the physical layer. We propose a three-layer architecture in which (i) a human operator issues natural-language instructions, (ii) an LLM-based supervisor periodically interprets, verifies, and corrects the commanded task in the context of the latest state and target estimates, and (iii) a distributed inner-loop controller tracks the resulting reference using only local relative information. We derive a theoretical guarantee that characterizes tracking performance under bounded disturbances and piecewise-smooth references with discrete jumps induced by LLM updates. Overall, our results illustrate how centralized language-based task reasoning can be combined with distributed feedback control to achieve complex behaviors with provable robustness and stability.
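To make the inner layer concrete, the following is a minimal sketch of a distributed formation controller on a radius-based graph. It is not the controller from this work: the single-integrator drone model, the consensus-type control law, the gain, the radius, and all function names are assumptions chosen for illustration. Each agent uses only relative positions of neighbors within the communication radius, and the reference formation switches abruptly halfway through the run to mimic a discrete jump caused by a supervisor update.

```python
import numpy as np

def radius_graph(positions, radius):
    """Boolean adjacency: agents i and j are neighbors when within `radius`."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist <= radius) & ~np.eye(len(positions), dtype=bool)

def edge_errors(positions, reference, adj):
    """Per-edge error (p_i - p_j) - (r_i - r_j), zeroed for non-neighbors."""
    rel_p = positions[:, None, :] - positions[None, :, :]
    rel_r = reference[:, None, :] - reference[None, :, :]
    return (rel_p - rel_r) * adj[:, :, None]

def control_step(positions, reference, radius, gain, dt, disturbance):
    """One Euler step of the assumed law u_i = -gain * sum_j edge_error_ij."""
    adj = radius_graph(positions, radius)
    u = -gain * edge_errors(positions, reference, adj).sum(axis=1)
    return positions + dt * (u + disturbance)

def max_edge_error(positions, reference, radius):
    """Largest edge-error norm over the current communication graph."""
    adj = radius_graph(positions, radius)
    err = edge_errors(positions, reference, adj)
    return float(np.linalg.norm(err, axis=-1).max())

def simulate(steps=400, dt=0.02, gain=1.0, radius=10.0, dist_bound=0.0, seed=0):
    """Track a square formation, then jump to a line formation at steps // 2."""
    rng = np.random.default_rng(seed)
    positions = rng.uniform(-2.0, 2.0, size=(4, 2))
    square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
    line = np.array([[0, 0], [1, 0], [2, 0], [3, 0]], dtype=float)
    history = []
    for k in range(steps):
        reference = square if k < steps // 2 else line  # discrete reference jump
        # Bounded disturbance: each component lies in [-dist_bound, dist_bound].
        d = rng.uniform(-dist_bound, dist_bound, size=positions.shape)
        positions = control_step(positions, reference, radius, gain, dt, d)
        history.append(max_edge_error(positions, reference, radius))
    return np.array(history)

if __name__ == "__main__":
    clean = simulate()
    noisy = simulate(dist_bound=0.5)
    print("before jump:", clean[199], "after jump:", clean[200], "final:", clean[-1])
    print("final error with bounded disturbance:", noisy[-1])
```

In this toy setting the edge error decays geometrically, spikes when the reference jumps, and decays again. With a bounded disturbance it settles into a small neighborhood of zero rather than vanishing, which is the qualitative behavior an input-to-state stability bound describes.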
