Textual understanding boost in the WikiRace

Raman Ebrahimi, Sean Fuhrman, Kendrick Nguyen, Harini Gurusankar, Massimo Franceschetti

TL;DR

This paper casts WikiRace as a goal-directed navigation problem on a directed graph $G=(V,E)$, where an agent moves step by step from a start node toward a target by selecting, at each node $v$, one of its out-neighbors in $N^+(v)$. It systematically compares graph-theoretic centrality, semantic (embedding-based) navigation, and hybrid strategies on a pruned subgraph of approximately $10^5$ nodes, highlighting the power of semantic signals encoded by embeddings such as all-MiniLM-L6-v2. The main finding is that a purely greedy semantic navigator, augmented with a simple loop-avoidance mechanism, achieves a perfect success rate and navigates with an efficiency an order of magnitude better than structural or hybrid approaches, which challenges the necessity of exploration in this setting. The work underscores the potential of language-model-based navigators as zero-shot agents in complex information spaces and provides a pipeline and benchmark for future research in goal-directed graph navigation.
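As a rough formalization of the greedy semantic rule described above, one step of the agent could be written as follows. The symbols $\phi$ (a title-embedding function), $v_{\mathrm{target}}$ (the target node), and $S_t$ (the set of nodes visited up to step $t$) are notation assumed here for illustration, and excluding visited nodes is only one plausible form of the loop-avoidance mechanism.

$$
v_{t+1} \;=\; \arg\max_{u \,\in\, N^+(v_t) \setminus S_t} \; \cos\!\big(\phi(u),\, \phi(v_{\mathrm{target}})\big), \qquad S_{t+1} = S_t \cup \{v_{t+1}\}
$$

The walk succeeds when $v_t = v_{\mathrm{target}}$ and fails if $N^+(v_t) \setminus S_t$ is empty or a step budget is exhausted.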

Abstract

The WikiRace game, where players navigate between Wikipedia articles using only hyperlinks, serves as a compelling benchmark for goal-directed search in complex information networks. This paper presents a systematic evaluation of navigation strategies for this task, comparing agents guided by graph-theoretic structure (betweenness centrality), semantic meaning (language model embeddings), and hybrid approaches. Through rigorous benchmarking on a large Wikipedia subgraph, we demonstrate that a purely greedy agent guided by the semantic similarity of article titles is overwhelmingly effective. This strategy, when combined with a simple loop-avoidance mechanism, achieved a perfect success rate and navigated the network with an efficiency an order of magnitude better than structural or hybrid methods. Our findings highlight the critical limitations of purely structural heuristics for goal-directed search and underscore the transformative potential of large language models to act as powerful, zero-shot semantic navigators in complex information spaces.
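The greedy strategy with loop avoidance can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `greedy_navigate` and `title_similarity`, the `max_steps` budget, and the toy graph are all hypothetical, and the word-overlap score is a stand-in for the cosine similarity of title embeddings that a model such as all-MiniLM-L6-v2 would supply. Loop avoidance is modeled here as simply never revisiting a node, which may differ from the mechanism used in the paper.

```python
from typing import Callable, Dict, List, Optional


def title_similarity(a: str, b: str) -> float:
    """Toy stand-in for embedding similarity: Jaccard overlap of title words."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def greedy_navigate(
    graph: Dict[str, List[str]],
    start: str,
    target: str,
    similarity: Callable[[str, str], float] = title_similarity,
    max_steps: int = 50,  # hypothetical step budget
) -> Optional[List[str]]:
    """Follow the out-neighbor most similar to the target, never revisiting a node.

    Returns the path from start to target, or None on a dead end or timeout.
    """
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        if current == target:
            return path
        # Loop avoidance (assumed form): drop neighbors already on the path.
        candidates = [n for n in graph.get(current, []) if n not in visited]
        if not candidates:
            return None  # dead end: every out-neighbor was already visited
        # Greedy choice: the neighbor whose title scores highest against the target.
        current = max(candidates, key=lambda n: similarity(n, target))
        visited.add(current)
        path.append(current)
    return path if current == target else None


# Hypothetical toy graph: article title -> titles it links to.
toy_graph = {
    "Aristotle": ["Ancient Greece", "Political philosophy"],
    "Ancient Greece": ["Aristotle", "Greek Civil War"],
    "Political philosophy": ["Aristotle", "Cold War politics"],
    "Greek Civil War": ["Ancient Greece", "Cold War"],
    "Cold War politics": ["Cold War"],
    "Cold War": [],
}

if __name__ == "__main__":
    print(greedy_navigate(toy_graph, "Aristotle", "Cold War"))
```

Passing the similarity function as a parameter keeps the navigation logic independent of the scoring model, so an embedding-based scorer can be substituted without touching the loop. On a two-node cycle such as `{"A": ["B"], "B": ["A"]}` with an unreachable target, the visited set makes the walk stop with `None` instead of bouncing between the two nodes.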

Paper Structure

This paper contains 13 sections, 2 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: An instance of WikiRace on a small 1,000-node subgraph, starting from the node Aristotle (green) with the target node Cold War (yellow). Displayed strategies are random (red), $LLM^*$ (yellow), and betweenness (green). The optimal path is shown by the black links.