Reasoning about the Unseen for Efficient Outdoor Object Navigation
Quanting Xie, Tianyi Zhang, Kedi Xu, Matthew Johnson-Roberson, Yonatan Bisk
TL;DR
This work tackles outdoor object navigation with underspecified goals by introducing the OUTDOOR task and a reasoning-based planning framework. It proposes Reasoned Explorer, which couples an adaptive frontier graph with two LLMs (LLM_Visionary and LLM_Evaluator) and a Rapidly-exploring Random Tree to simulate and evaluate future states, integrated with perception through a Vision-Language Model and real-robot control via a PID loop. A new Computationally Adjusted Success Rate (CASR) metric balances success against planning and travel time, enabling fair comparisons across compute budgets. In AirSim simulations and real-world experiments on a drone and a quadruped, the approach outperforms baselines and shows that LLM-guided outdoor navigation is viable without premapping, with practical implications for robust, perception-aware robotics in open environments.
Abstract
Robots should exist anywhere humans do: indoors, outdoors, and even in unmapped environments. In contrast, recent advances in Object Goal Navigation (OGN) have focused on indoor environments, leveraging spatial and semantic cues that do not generalize outdoors. While these contributions provide valuable insights into indoor scenarios, real-world robotic applications often extend to outdoor settings. The vast and complex terrain of outdoor environments raises new challenges: unlike the structured layouts found indoors, outdoor environments lack clear spatial delineations and are riddled with inherent semantic ambiguities. Despite this, humans navigate with ease because we can reason about the unseen. We introduce a new task, OUTDOOR; a new mechanism for Large Language Models (LLMs) to accurately hallucinate possible futures; and a new computationally aware success metric for pushing research forward in this more complex domain. Additionally, we show impressive results on both a simulated drone and a physical quadruped in outdoor environments. Our agent has no premapping, and our formalism outperforms naive LLM-based approaches.
