An LLM-based Framework for Human-Swarm Teaming Cognition in Disaster Search and Rescue
Kailun Ji, Xiaoyu Hu, Xinyu Zhang, Jun Chen
TL;DR
The paper tackles the intention-to-action gap in disaster SAR by introducing the LLM-CRF, a cognitive reasoning framework that uses multi-modal operator input and an LLM-based engine to ground intent, decompose tasks, and plan UAV swarm actions. It couples Intent Grounding, In-Context Learning–driven swarm task planning, and Closed-Loop Verification to produce auditable, executable plans while keeping a human-in-the-loop for safety. In simulations, LLM-CRF outperforms manual and baseline LLM approaches in mission success, search coverage, and survivor detection, while significantly reducing operator workload. The work demonstrates a viable pathway toward intuitive, safe, and scalable human-swarm collaboration in high-stakes SAR scenarios and outlines directions to address sensor-noise and real-world deployment challenges.
Abstract
Large-scale disaster Search And Rescue (SAR) operations are persistently challenged by complex terrain and disrupted communications. Unmanned Aerial Vehicle (UAV) swarms offer a promising solution for tasks such as wide-area search and supply delivery, yet coordinating them effectively places a significant cognitive burden on human operators. The core bottleneck in human-machine collaboration is the ``intention-to-action gap'': the error-prone process of translating a high-level rescue objective into low-level swarm commands under high-intensity, high-pressure conditions. To bridge this gap, this study proposes LLM-CRF, a cognitive reasoning framework that leverages Large Language Models (LLMs) to model and augment human-swarm teaming cognition. The framework first captures the operator's intention through natural, multi-modal interaction via voice or graphical annotations. It then employs the LLM as a cognitive engine to perform intention comprehension, hierarchical task decomposition, and mission planning for the UAV swarm. This closed-loop framework enables the swarm to act as a proactive partner that provides active feedback in real time while reducing the need for manual monitoring and control, which considerably improves the efficacy of the SAR task. We evaluate the proposed framework in a simulated SAR scenario. Experimental results show that, compared to traditional order- and command-based interfaces, the proposed LLM-driven approach reduces task completion time by approximately $64.2\%$ and improves task success rate by $7\%$. It also considerably reduces subjective cognitive workload, with NASA-TLX scores dropping by $42.9\%$. This work establishes the potential of LLMs to create more intuitive and effective human-swarm collaborations in high-stakes scenarios.
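
The pipeline described in the abstract (intent capture, task decomposition, verification, operator confirmation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`ground_intent`, `decompose`, `verify`, `plan_mission`) is hypothetical, and simple keyword matching stands in for the LLM so that the example runs without any model or external library.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Intent:
    goal: str    # high-level objective, e.g. "search" or "deliver"
    region: str  # target area, e.g. "sector-b", or "unknown"


@dataclass
class Subtask:
    uav_id: int
    action: str
    region: str


def ground_intent(utterance: str) -> Intent:
    """Hypothetical stand-in for LLM intent grounding (keyword matching)."""
    words = utterance.lower().replace(",", " ").split()
    goal = "deliver" if "deliver" in words else "search"
    region = "unknown"
    if "sector" in words:
        idx = words.index("sector")
        if idx + 1 < len(words):
            region = f"sector-{words[idx + 1]}"
    return Intent(goal, region)


def decompose(intent: Intent, uav_ids: List[int]) -> List[Subtask]:
    """Assumed decomposition: give each UAV one strip of the target region."""
    return [
        Subtask(uav_id=uav, action=intent.goal, region=f"{intent.region}/strip-{i}")
        for i, uav in enumerate(uav_ids)
    ]


def verify(plan: List[Subtask], available: Set[int]) -> bool:
    """Closed-loop check: known region, available UAVs, no double assignment."""
    ids = [task.uav_id for task in plan]
    return (
        len(plan) > 0
        and len(ids) == len(set(ids))
        and all(uav in available for uav in ids)
        and all(not task.region.startswith("unknown") for task in plan)
    )


def plan_mission(
    utterance: str,
    uav_ids: List[int],
    confirm: Callable[[List[Subtask]], bool],
) -> List[Subtask]:
    """Return an executable plan, or an empty list if it fails a check."""
    intent = ground_intent(utterance)
    plan = decompose(intent, uav_ids)
    if not verify(plan, set(uav_ids)):
        return []  # plan rejected automatically; the operator must rephrase
    return plan if confirm(plan) else []  # human-in-the-loop approval
```

In this sketch, `plan_mission("Search sector B for survivors", [1, 2, 3], lambda plan: True)` yields one subtask per UAV, while an utterance that names no sector fails verification and yields an empty plan. Passing the approval step in as a callback keeps the human operator as the final gate before any command reaches the swarm.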
