REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation

Shuqi Xiao; Maani Ghaffari; Chengzhong Xu; Hui Kong

Abstract

Zero-shot object-goal navigation (ZSON) requires navigating unknown environments to find a target object without task-specific training. Prior hierarchical training-free solutions invest in scene understanding (belief) and high-level decision-making (policy), yet overlook the design of the option, i.e., a subgoal candidate proposed from evolving belief and presented to policy for selection. In practice, options are reduced to isolated waypoints scored independently: single destinations hide the value gathered along the journey; an unstructured collection obscures the relationships among candidates. Our insight is that the option space should be a tree of paths. Full paths expose en-route information gain that destination-only scoring systematically neglects; a tree of shared segments enables coarse-to-fine LLM reasoning that dismisses or pursues entire branches before examining individual leaves, compressing the combinatorial path space into an efficient hierarchy. We instantiate this insight in REST (Receding Horizon Explorative Steiner Tree), a training-free framework that (1) builds an explicit open-vocabulary 3D map from online RGB-D streams; (2) grows an agent-centric tree of safe and informative paths as the option space via sampling-based planning; and (3) textualizes each branch into a spatial narrative and selects the next-best path through chain-of-thought LLM reasoning. Across the Gibson, HM3D, and HSSD benchmarks, REST consistently ranks among the top methods in success rate while achieving the best or second-best path efficiency, demonstrating a favorable efficiency-success balance.
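
The contrast between destination-only scoring and full-path scoring can be illustrated with a toy sketch. This is not the REST implementation: the tree, the per-node gain values, and the function names below are hypothetical, chosen only to show how summing gain along a path can rank candidates differently from scoring the endpoint alone.

```python
# Hypothetical toy tree: node -> parent, with node 0 as the agent.
# Neither the topology nor the numbers come from the paper.
PARENT = {1: 0, 2: 1, 3: 0, 4: 3}

# Hypothetical information gain collected at each node (arbitrary units).
GAIN = {0: 0.0, 1: 3.0, 2: 1.0, 3: 0.5, 4: 2.0}


def path_to(node):
    """Return the node sequence from the agent (0) to `node`."""
    path = [node]
    while path[-1] != 0:
        path.append(PARENT[path[-1]])
    return path[::-1]


def destination_score(leaf):
    """Score a candidate by the gain at its endpoint only."""
    return GAIN[leaf]


def path_score(leaf):
    """Score a candidate by the gain accumulated along the whole path."""
    return sum(GAIN[n] for n in path_to(leaf))


# Leaves are nodes that are never a parent of another node.
leaves = [n for n in PARENT if n not in PARENT.values()]

best_by_destination = max(leaves, key=destination_score)
best_by_path = max(leaves, key=path_score)
```

In this toy case the endpoint-only ranking prefers leaf 4, while the path-level ranking prefers leaf 2 because its route passes through the high-gain node 1.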

Paper Structure (35 sections, 1 equation, 3 figures, 2 tables, 1 algorithm)

Introduction
Related Work
Foundation models for ObjectNav
Autonomous exploration for ObjectNav
Methodology
Belief Update
Geometric mapping
Semantic mapping
2D perception
3D mapping
Road Mapping
Option Space
Informative viewpoint sampler
Spatial thinning
Information-gain gating
...and 20 more sections

Figures (3)

Figure 1: REST reasons over an agent-centric tree of safe and informative paths rather than evaluating isolated waypoints. Here, REST selects the next-best subtree among four options via spatial narratives; a conventional agent (e.g., VLFM [Yokoyama et al., 2024]) independently scores ten waypoints by semantic similarity and geometric proximity, discarding spatial-temporal context.
Figure 2: Overview of REST, a training-free ObjectNav framework that replans in a receding-horizon manner. At each decision cycle, the agent updates the belief from online RGB-D streams, grows an agent-centric Steiner tree of safe and informative paths as the option space, textualizes each branch into a spatial narrative, and selects the next-best path through chain-of-thought LLM reasoning.
Figure 3: The RT-RRT* subtree connecting the current agent position (index 0) to all informative viewpoints (indices 1 to 15) on the left, versus the optimized Steiner tree on the right. Independent per-path optimization produces redundant edges, while the OAESMT formulation merges shared segments and surfaces decision junctions, reducing total path length from $85 \mathrm{m}$ to $47 \mathrm{m}$.
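
The length reduction described in Figure 3 comes from sharing segments through junctions. The following is a minimal geometric sketch of that effect, not the paper's OAESMT solver: it ignores obstacles, and the coordinates, the junction location, and the variable names are hypothetical.

```python
import math

# Hypothetical 2D positions in metres; not taken from the paper.
agent = (0.0, 0.0)
viewpoints = [(4.0, 1.0), (4.0, -1.0)]

# Option A: connect the agent to each viewpoint with its own straight path.
independent_length = sum(math.dist(agent, v) for v in viewpoints)

# Option B: route both paths through a shared junction (a Steiner point).
# The junction location is hand-picked for illustration, not optimized.
junction = (3.0, 0.0)
merged_length = math.dist(agent, junction) + sum(
    math.dist(junction, v) for v in viewpoints
)

# Length saved by traversing the shared trunk once instead of twice.
saving = independent_length - merged_length
```

Here the two independent paths total about 8.25 m, while the tree with one shared trunk totals about 5.83 m. The junction is also the point at which the agent must commit to one viewpoint or the other.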
