Selective Exploration and Information Gathering in Search and Rescue Using Hierarchical Learning Guided by Natural Language Input
Dimitrios Panagopoulos, Adolfo Perrusquia, Weisi Guo
TL;DR
The paper addresses rapid, context-aware decision-making in search and rescue (SAR) by combining large language models (LLMs) with hierarchical reinforcement learning (HRL). It introduces a conceptual architecture comprising a Context Extractor, an Information Space, a Strategic Decision Engine, and an Attention Space, which together convert verbal human input into actionable, multi-level policies formalized within an extended Markov decision process (MDP) framework. Using Retrieval-Augmented Generation (RAG) to inject domain knowledge, and evaluating in a simulated 2D SAR environment, the study shows that domain-informed LLMs and attention-guided HRL can improve learning efficiency, safety, and information gathering under sparse rewards. The findings suggest that human-in-the-loop, language-enabled planning can enhance autonomous SAR performance and offer practical benefits for real-world disaster response, while scalability and robustness in continuous domains remain open areas.
Abstract
In recent years, robots and autonomous systems have become increasingly integral to our daily lives, offering solutions to complex problems across various domains. Their application in search and rescue (SAR) operations, however, presents unique challenges. Comprehensively exploring a disaster-stricken area is often infeasible due to the vastness of the terrain, the transformed environment, and the time constraints involved. Traditional robotic systems typically follow predefined search patterns and cannot incorporate or exploit ground truths provided by human stakeholders, which can be key to speeding up the learning process and enhancing triage. To address this gap, we introduce a system that integrates social interaction via large language models (LLMs) with a hierarchical reinforcement learning (HRL) framework. The proposed system is designed to translate verbal inputs from human stakeholders into actionable RL insights and to adjust its search strategy accordingly. By leveraging human-provided information through LLMs and structuring task execution through HRL, our approach not only bridges the gap between autonomous capabilities and human intelligence but also significantly improves the agent's learning efficiency and decision-making process in environments characterised by long horizons and sparse rewards.
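To make the idea of turning stakeholder input into "actionable RL insights" more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the LLM has already parsed a verbal message into grid cells, and it uses hypothetical names throughout (`InformationSpace`, `extract_context`, `attention_bonus`, `choose_subgoal`, and the sub-goal labels). The reward shaping rule, with a bonus on reported points of interest and a penalty on reported hazards, is likewise an assumption chosen for illustration in a 2D grid setting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, col) in a 2D grid world


@dataclass
class InformationSpace:
    """Structured facts taken from a stakeholder message (hypothetical schema)."""
    hazards: Set[Cell] = field(default_factory=set)
    points_of_interest: Set[Cell] = field(default_factory=set)


def extract_context(parsed_message: Dict[str, List[List[int]]]) -> InformationSpace:
    """Stand-in for an LLM-based context extractor.

    Assumes the message was already parsed into lists of grid cells;
    no language model is called here.
    """
    return InformationSpace(
        hazards={tuple(c) for c in parsed_message.get("hazards", [])},
        points_of_interest={tuple(c) for c in parsed_message.get("points_of_interest", [])},
    )


def attention_bonus(info: InformationSpace, cell: Cell,
                    bonus: float = 1.0, penalty: float = -1.0) -> float:
    """Assumed shaping term: penalise reported hazards, reward reported POIs."""
    if cell in info.hazards:
        return penalty
    if cell in info.points_of_interest:
        return bonus
    return 0.0


def shaped_reward(env_reward: float, info: InformationSpace, cell: Cell) -> float:
    """Add the attention term to the (typically sparse) environment reward."""
    return env_reward + attention_bonus(info, cell)


def choose_subgoal(info: InformationSpace, visited: Set[Cell]) -> str:
    """Toy high-level policy: visit reported POIs first, then explore."""
    remaining = info.points_of_interest - visited
    return "collect_information" if remaining else "explore"


# Example: a message reporting one hazard and one point of interest.
info = extract_context({"hazards": [[1, 1]], "points_of_interest": [[2, 3]]})
```

In this sketch, the high-level policy selects a sub-goal from the extracted information, while a low-level learner would receive the shaped reward at each step. A learned high-level policy would replace the hand-written rule in `choose_subgoal`.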
