WebNavigator: Global Web Navigation via Interaction Graph Retrieval

Xuanwang Zhang, Yuteng Han, Jinnan Qi, Mulong Xie, Zhen Wu, Xinyu Dai

Abstract

Despite significant advances in autonomous web navigation, current methods remain far from human-level performance in complex web environments. We argue that this limitation stems from Topological Blindness, where agents are forced to explore via trial-and-error without access to the global topological structure of the environment. To overcome this limitation, we introduce WebNavigator, which reframes web navigation from probabilistic exploration into deterministic retrieval and pathfinding. WebNavigator constructs Interaction Graphs via zero-token cost heuristic exploration offline and implements a Retrieve-Reason-Teleport workflow for global navigation online. WebNavigator achieves state-of-the-art performance on WebArena and OnlineMind2Web. On WebArena multi-site tasks, WebNavigator achieves a 72.9% success rate, more than doubling the performance of enterprise-level agents. This work reveals that Topological Blindness, rather than model reasoning capabilities alone, is an underestimated bottleneck in autonomous web navigation.

Paper Structure (22 sections, 7 equations, 7 figures, 6 tables, 1 algorithm)

Figures (7)

  • Figure 1: Overview of WebNavigator. WebNavigator resolves Topological Blindness via a two-phase paradigm. (1) Offline Interaction Graph Construction. A heuristic auto-exploration engine discovers dynamic page observations at zero-token cost and indexes all observations into a vector database. (2) Online Retrieval-Augmented Navigation. The Global-View Navigator implements a three-stage workflow: Retrieve top-$k$ candidates from the Interaction Graph via multimodal retrieval; Reason to identify the optimal target page; and Teleport by computing and executing the shortest path within the Interaction Graph, achieving globally optimal navigation.
  • Figure 2: Trajectory comparison on a multi-site task (WebArena 760), which requires retrieving a specific customer address from the CMS to plan a route on the Map. WebNavigator achieves human-level planning via two navigate(domain, query) actions, whereas the ReAct baseline prematurely terminates due to Topological Blindness. The human expert trajectory is the gold standard.
  • Figure 3: Reddit
  • Figure 4: Map
  • Figure 5: CMS
  • ...and 2 more figures
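
The Retrieve-Reason-Teleport workflow described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the page embeddings, and the function names `retrieve` and `shortest_path` are hypothetical, plain cosine similarity stands in for the paper's multimodal retrieval, and the Reason stage is reduced to taking the top-ranked candidate. The Teleport stage is shown as a breadth-first search, which returns a shortest action sequence when every edge of the Interaction Graph costs one action.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(index, query_vec, k):
    """Retrieve: return the k page ids most similar to the query vector."""
    ranked = sorted(index, key=lambda page: cosine(index[page], query_vec), reverse=True)
    return ranked[:k]


def shortest_path(graph, start, goal):
    """Teleport: BFS over the graph; returns the action list, or None if unreachable."""
    parents = {start: None}  # page -> (previous page, action taken)
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            break
        for action, nxt in graph.get(page, {}).items():
            if nxt not in parents:
                parents[nxt] = (page, action)
                queue.append(nxt)
    if goal not in parents:
        return None
    actions = []
    node = goal
    while parents[node] is not None:
        node, action = parents[node]
        actions.append(action)
    return actions[::-1]


# Hypothetical Interaction Graph: page -> {action: resulting page}.
graph = {
    "home": {"click:Customers": "customers", "click:Orders": "orders"},
    "customers": {"click:Alice": "customer_alice"},
    "orders": {},
}
# Hypothetical page embeddings (assumed to be built offline).
index = {
    "home": [1.0, 0.0, 0.0],
    "customers": [0.0, 1.0, 0.0],
    "customer_alice": [0.0, 0.9, 0.4],
    "orders": [0.0, 0.0, 1.0],
}

candidates = retrieve(index, [0.0, 0.8, 0.6], k=2)
target = candidates[0]  # stand-in for the Reason stage
plan = shortest_path(graph, "home", target)
print(candidates, plan)
```

In this toy setting the agent reaches the target with a precomputed action sequence instead of exploring page by page, which is the contrast with trial-and-error navigation that the abstract draws.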