TypeFly: Flying Drones with Large Language Model
Guojun Chen, Xiaojing Yu, Neiwen Ling, Lin Zhong
TL;DR
TypeFly tackles the latency bottleneck of LLM-driven drone control by introducing MiniSpec, a token-efficient domain-specific language, and a stream-based runtime that interprets and executes plans as they are generated. The system couples an on-prem edge server, a vision module, and a cloud LLM with a prompt generator, enabling low-latency control through streaming execution, a probe mechanism for runtime reasoning, and an exception-handling facility (replan) for dynamic environments. Across 11 benchmark tasks, TypeFly demonstrates up to 62% reduction in response time and substantial token savings, with robust performance aided by MiniSpec, probe, and replan; however, limitations remain in geometric reasoning and memory of past scenes. The work highlights practical advances toward real-time, privacy-preserving, LLM-assisted drone control and suggests directions such as memory-enabled scene modeling and prompt-caching to further reduce latency and improve reliability.
Abstract
Recent advancements in robot control using large language models (LLMs) have demonstrated significant potential, primarily due to LLMs' capabilities to understand natural language commands and generate executable plans in various languages. However, in real-time and interactive applications involving mobile robots, particularly drones, the sequential token generation process inherent to LLMs introduces substantial latency, i.e., response time, in control plan generation. In this paper, we present a system called TypeFly that tackles this problem using a combination of a novel programming language called MiniSpec and its runtime to reduce the plan generation time and drone response time. That is, instead of asking an LLM to write a program (robotic plan) in the popular but verbose Python, TypeFly has it write the program in MiniSpec, a language specially designed for token efficiency and stream interpretation. Using a set of challenging drone tasks, we show that the design choices made by TypeFly can reduce response time by up to 62% and provide a more consistent user experience, enabling responsive and intelligent LLM-based drone control with efficient completion.
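To make the idea of stream interpretation concrete, the following is a minimal sketch in Python, not the TypeFly runtime. It assumes a toy plan format in which statements are separated by `;`; this format, the skill names (`takeoff`, `forward`, `land`), and the helper functions are all hypothetical and are not MiniSpec syntax. The point it illustrates is that each statement can be dispatched as soon as its terminator arrives, rather than after the whole plan has been generated.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def stream_statements(chunks: Iterable[str], terminator: str = ";") -> Iterator[str]:
    """Yield each complete statement as soon as its terminator arrives."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # A single chunk may complete zero, one, or several statements.
        while terminator in buffer:
            statement, buffer = buffer.split(terminator, 1)
            statement = statement.strip()
            if statement:
                yield statement
    # Flush a trailing statement that was never terminated.
    if buffer.strip():
        yield buffer.strip()


def run_stream(chunks: Iterable[str], skills: Dict[str, Callable[[str], str]]) -> List[str]:
    """Execute statements one by one while later chunks are still pending."""
    log = []
    for statement in stream_statements(chunks):
        name, _, arg = statement.partition(" ")
        log.append(skills[name](arg))
    return log


# Hypothetical drone skills; a real system would issue flight commands here.
skills = {
    "takeoff": lambda arg: "takeoff",
    "forward": lambda arg: f"forward {int(arg)} cm",
    "land": lambda arg: "land",
}

# Chunks stand in for LLM output tokens, which need not align with statements.
chunks = ["take", "off;for", "ward 50", ";la", "nd;"]
print(run_stream(chunks, skills))
```

Because `stream_statements` is a generator, the first statement is available after only the chunks that complete it have been consumed, which is the property that lets execution overlap with generation.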
