CogExplore: Contextual Exploration with Language-Encoded Environment Representations

Harel Biggie; Patrick Cooper; Doncey Albin; Kristen Such; Christoffer Heckman

TL;DR

The paper tackles exploration of unknown environments by using large language models to bring semantic and temporal context into navigation. CogExplore encodes scene elements as natural language and uses a probabilistic, memory-augmented prompting framework to select informative waypoints, balancing geometric reach with semantic cues. By combining a frontier-based planner, an open-vocabulary detection pipeline, and modular prompts, the approach achieves robust, temporally coherent exploration with an explicit justification for each decision. In photorealistic Unreal Engine experiments across three environments, CogExplore achieves a 100% success rate and shorter path lengths than baselines, demonstrating the practical value of language-grounded reasoning for search-and-rescue-style tasks. The authors acknowledge the limits of simulation-only evaluation and the need for real-world validation.

Abstract

Integrating language models into robotic exploration frameworks improves performance in unmapped environments by providing the ability to reason over semantic groundings, contextual cues, and temporal states. The proposed method employs large language models (GPT-3.5 and Claude Haiku) to reason over these cues and express that reasoning in terms of natural language, which can be used to inform future states. We are motivated by the context of search-and-rescue applications where efficient exploration is critical. We find that by leveraging natural language, semantics, and tracking temporal states, the proposed method greatly reduces exploration path distance and further exposes the need for environment-dependent heuristics. Moreover, the method is highly robust to a variety of environments and noisy vision detections, as shown with a 100% success rate in a series of comprehensive experiments across three different environments conducted in a custom simulation pipeline operating in Unreal Engine.
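
To make the idea of language-encoded waypoints concrete, the following is a minimal sketch of how candidate waypoints and a memory of earlier decisions might be rendered into a single text prompt. It is an illustration under stated assumptions, not the authors' implementation: the `Waypoint` and `Memory` structures, the `describe` and `build_prompt` helpers, and the prompt wording are all hypothetical, and no language model is actually called.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Waypoint:
    """A candidate frontier waypoint (hypothetical structure, for illustration only)."""
    name: str
    distance_m: float           # path distance from the robot, in meters
    nearby_objects: List[str]   # labels reported by an object detector


@dataclass
class Memory:
    """Running log of earlier decisions, fed back into later prompts."""
    entries: List[str] = field(default_factory=list)

    def add(self, note: str) -> None:
        self.entries.append(note)


def describe(wp: Waypoint) -> str:
    """Encode one waypoint as a short natural-language line."""
    objs = ", ".join(wp.nearby_objects) if wp.nearby_objects else "nothing notable"
    return f"{wp.name}: {wp.distance_m:.1f} m away, near {objs}"


def build_prompt(target: str, waypoints: List[Waypoint], memory: Memory) -> str:
    """Assemble the task, past decisions, and candidates into one prompt string."""
    lines = [f"You are guiding a robot searching for: {target}."]
    if memory.entries:
        # Past decisions give the model temporal context.
        lines.append("Previous decisions:")
        lines.extend(f"- {e}" for e in memory.entries)
    lines.append("Candidate waypoints:")
    lines.extend(f"- {describe(wp)}" for wp in waypoints)
    lines.append("Reply with one waypoint name and a one-sentence justification.")
    return "\n".join(lines)


if __name__ == "__main__":
    memory = Memory()
    memory.add("Chose W0 because it led toward the kitchen.")
    candidates = [
        Waypoint("W1", 3.0, ["sofa"]),
        Waypoint("W2", 7.5, []),
    ]
    print(build_prompt("a backpack", candidates, memory))
```

In a sketch like this, the distance field carries the geometric cue and the object labels carry the semantic cue, so both reach the model in the same textual form. The model's reply would be appended to the memory before the next planning step.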

Paper Structure (20 sections, 1 equation, 12 figures, 1 table)

Introduction
Related Works
Robotic Exploration
LLM-Powered Robotic Navigation
CogExplore
Exploration Planner
Foundation Model Prompting Schemes
Experimental Results
Experimental Setup
Discussion
Conclusions and Limitations
Appendix
Overhead Path Views
Open Vocabulary Object Detection Pipeline
Prompts For Foundation Models
...and 5 more sections

Figures (12)

Figure 1: Spot characterizing its environment through its VQA model (Language Priors), searching for specific objects with its object detection model and creating projections (Object Points shown in red) surrounded by a set of navigable graph points (shown in purple).
Figure 2: Renderings from Unreal Engine Environments
Figure 3: CogExplore System Diagram
Figure 4: Example Runs Demonstrating Varieties of Reasoning
Figure 5: Path length comparisons for each method (CE-3.5, CE-H, VEFEP) across the seven tasks. The black line in each whisker plot marks the mean.
...and 7 more figures
