Understanding the Geospatial Reasoning Capabilities of LLMs: A Trajectory Recovery Perspective
Thinh Hung Truong, Jey Han Lau, Jianzhong Qi
TL;DR
This paper asks whether general-purpose LLMs can read road networks and perform navigation without external routing tools. It introduces GLOBALTRACE, a dataset of over 4,000 real-world trajectories, and a two-stage prompting framework (Stage 1: Path Selection; Stage 2: Coordinate Generation) that reconstructs long masked GPS segments from road-network context. Experiments show that state-of-the-art LLMs achieve strong zero-shot trajectory recovery, often rivaling Google Maps, with GPT-4.1 leading among LLMs. The models generalize across regions and activity types, although systematic biases along both dimensions persist. The work also demonstrates how LLMs can incorporate user preferences to produce flexible, preference-aware routes, which suggests significant potential for mobility and urban planning applications while highlighting the need to address biases and privacy concerns.
Abstract
We explore the geospatial reasoning capabilities of Large Language Models (LLMs), specifically whether LLMs can read road network maps and perform navigation. We frame trajectory recovery as a proxy task, which requires models to reconstruct masked GPS traces, and introduce GLOBALTRACE, a dataset with over 4,000 real-world trajectories across diverse regions and transportation modes. Using the road network as context, our prompting framework enables LLMs to generate valid paths without accessing any external navigation tools. Experiments show that LLMs outperform off-the-shelf baselines and specialized trajectory recovery models, with strong zero-shot generalization. Fine-grained analysis shows that LLMs have strong comprehension of the road network and coordinate systems, but also exhibit systematic biases with respect to regions and transportation modes. Finally, we demonstrate how LLMs can enhance navigation experiences by reasoning over maps in flexible ways to incorporate user preferences.
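To make the trajectory recovery task concrete, the following is a minimal sketch of how a GPS trace could be masked and a reconstruction scored. It is an illustration under stated assumptions, not the paper's pipeline: the function names (`haversine_m`, `mask_segment`, `mean_error_m`), the representation of a trace as a list of (latitude, longitude) pairs in degrees, and the use of mean pointwise haversine distance as the error measure are all hypothetical choices made here.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def mask_segment(trace, start, end):
    """Split a trace into (visible prefix, hidden middle, visible suffix).

    The hidden middle is trace[start:end]; at least one point must remain
    visible on each side so the model knows where the gap begins and ends.
    """
    if not 0 < start < end < len(trace):
        raise ValueError("masked span must leave a visible point on each side")
    return trace[:start], trace[start:end], trace[end:]


def mean_error_m(predicted, hidden):
    """Mean pointwise haversine error (assumed metric, equal-length lists)."""
    if len(predicted) != len(hidden):
        raise ValueError("predicted and hidden segments must have equal length")
    return sum(haversine_m(p, h) for p, h in zip(predicted, hidden)) / len(hidden)


# Hypothetical five-point trace; the three middle points are hidden.
trace = [(0.0, 0.000), (0.0, 0.001), (0.0, 0.002), (0.0, 0.003), (0.0, 0.004)]
prefix, hidden, suffix = mask_segment(trace, 1, 4)
```

In this framing, a model would receive `prefix`, `suffix`, and road-network context, and its output for the gap would be compared against `hidden`. Pointwise comparison requires the prediction to have the same number of points as the hidden segment, which is a simplification; a real evaluation may need to resample or align the two paths first.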
