GET: Goal-directed Exploration and Targeting for Large-Scale Unknown Environments
Lanxiang Zheng, Ruidong Mei, Mingxin Wei, Hao Ren, Hui Cheng
TL;DR
GET addresses object search in large-scale unknown environments by integrating LLM-based semantic reasoning with memory-guided exploration. It introduces the Diagram of Unified Thought (DoUT), which provides real-time, feedback-driven decision refinement, and uses a Gaussian Mixture Model (GMM) based Task Probability Map to continually update target-location priors. A Semantic Octomap and a Trajectory Refinement module complete the perception and navigation stack, enabling safe, efficient exploration. In real-world experiments, GET achieves substantial reductions in path length and search time across multiple scenes and LLMs, outperforming heuristic and LLM-only baselines and demonstrating scalable, embodied decision-making in complex environments.
Abstract
Object search in large-scale, unstructured environments remains a fundamental challenge in robotics, particularly in dynamic or expansive settings such as outdoor autonomous exploration. This task requires robust spatial reasoning and the ability to leverage prior experiences. While Large Language Models (LLMs) offer strong semantic capabilities, their application in embodied contexts is limited by a grounding gap in spatial reasoning and insufficient mechanisms for memory integration and decision consistency. To address these challenges, we propose GET (Goal-directed Exploration and Targeting), a framework that enhances object search by combining LLM-based reasoning with experience-guided exploration. At its core is DoUT (Diagram of Unified Thought), a reasoning module that facilitates real-time decision-making through a role-based feedback loop, integrating task-specific criteria and external memory. For repeated tasks, GET maintains a probabilistic task map based on a Gaussian Mixture Model, allowing for continual updates to object-location priors as environments evolve. Experiments conducted in real-world, large-scale environments demonstrate that GET improves search efficiency and robustness across multiple LLMs and task settings, significantly outperforming heuristic and LLM-only baselines. These results suggest that structured LLM integration provides a scalable and generalizable approach to embodied decision-making in complex environments.
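To make the idea of a continually updated, GMM-based object-location prior concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 2D map frame, isotropic Gaussian components, a distance gate for merging new sightings into an existing component, and an exponential forgetting factor; the names `TaskProbabilityMap`, `Component`, `gate`, `decay`, and `init_sigma` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Component:
    weight: float  # unnormalized evidence (decayed count of sightings)
    mean: tuple    # (x, y) position in the map frame
    sigma: float   # isotropic standard deviation


@dataclass
class TaskProbabilityMap:
    """Illustrative object-location prior as a mixture of isotropic Gaussians."""
    init_sigma: float = 2.0  # assumed spread of a newly created component
    gate: float = 3.0        # merge a sighting if within gate * sigma of a mean
    decay: float = 0.9       # assumed forgetting factor applied at each update
    components: list = field(default_factory=list)

    def update(self, x, y):
        """Fold one observed object location into the prior."""
        # Older evidence fades so the prior can follow a changing environment.
        for c in self.components:
            c.weight *= self.decay
        nearest = min(self.components,
                      key=lambda c: math.dist(c.mean, (x, y)),
                      default=None)
        if nearest is not None and \
                math.dist(nearest.mean, (x, y)) <= self.gate * nearest.sigma:
            # Weighted running mean: pull the component toward the sighting.
            w = nearest.weight
            nearest.mean = ((w * nearest.mean[0] + x) / (w + 1.0),
                            (w * nearest.mean[1] + y) / (w + 1.0))
            nearest.weight = w + 1.0
        else:
            # No component explains the sighting, so start a new one.
            self.components.append(Component(1.0, (x, y), self.init_sigma))

    def density(self, x, y):
        """Mixture density at (x, y), with weights normalized to sum to one."""
        total = sum(c.weight for c in self.components)
        if total == 0.0:
            return 0.0
        p = 0.0
        for c in self.components:
            d2 = (x - c.mean[0]) ** 2 + (y - c.mean[1]) ** 2
            norm = 2.0 * math.pi * c.sigma ** 2
            p += (c.weight / total) * math.exp(-d2 / (2.0 * c.sigma ** 2)) / norm
        return p

    def best_goal(self):
        """Mean of the heaviest component, or None if the map is empty."""
        if not self.components:
            return None
        return max(self.components, key=lambda c: c.weight).mean


prior = TaskProbabilityMap()
for sighting in [(0.0, 0.0), (1.0, 0.0), (20.0, 20.0)]:
    prior.update(*sighting)
goal = prior.best_goal()
```

In this sketch, the two nearby sightings merge into a single component while the distant one creates a second, so `goal` points at the region with the most accumulated evidence. A planner could use `density` to rank candidate frontiers; how GET actually couples its task map to exploration is not specified in this section.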
