GPT4GEO: How a Language Model Sees the World's Geography

Jonathan Roberts, Timo Lüddecke, Sowmen Das, Kai Han, Samuel Albanie

TL;DR

The paper surveys GPT-4's geographic knowledge without external data feeds, using a staged suite of descriptive and application-centric experiments to profile factual recall and reasoning. It demonstrates strong performance on basic geographic facts (e.g., areas, heights, population estimates) and shows GPT-4 can synthesize information across diverse sources for route planning, navigation, and networks, while highlighting limitations in real-time data, exactness, and abstract optimization. The work underscores the model's potential for geospatial tooling and travel/navigation applications, but also cautions about hallucinations and prompt-driven variability, motivating further study of memorization versus reasoning and of integration with live geographic data. Overall, GPT-4 exhibits substantial, though imperfect, geographic competence that can be harnessed for downstream applications with careful prompting and validation.

Abstract

Large language models (LLMs) have shown remarkable capabilities across a broad range of tasks involving question answering and the generation of coherent text and code. Comprehensively understanding the strengths and weaknesses of LLMs is beneficial for safety, downstream applications and improving performance. In this work, we investigate the degree to which GPT-4 has acquired factual geographic knowledge and is capable of using this knowledge for interpretative reasoning, which is especially important for applications that involve geographic data, such as geospatial analysis, supply chain management, and disaster response. To this end, we design and conduct a series of diverse experiments, starting from factual tasks such as location, distance and elevation estimation to more complex questions such as generating country outlines and travel networks, route finding under constraints and supply chain analysis. We provide a broad characterisation of what GPT-4 (without plugins or Internet access) knows about the world, highlighting both potentially surprising capabilities and limitations.

Paper Structure (37 sections, 32 figures, 1 table)

Figures (32)

  • Figure 1: GPT4GEO experiments taxonomy. We initially focus on factual tasks before moving towards application-centric tasks requiring logic and reasoning that build on this knowledge.
  • Figure 2: Socioeconomic indicators. A quantitative evaluation of GPT-4's understanding of country-level human populations and their impact on the environment, including (a) country populations, (b) life expectancies, and (c) CO2 emissions per capita. The red circles denote outliers.
  • Figure 3: Spatial features. An evaluation of GPT-4's understanding of country areas (a), heights of the 300 tallest mountains (b), and locations of settlements of different populations (c).
  • Figure 4: Topography. Predicted (lines) and actual elevations (shaded areas) along the trajectories depicted on the left (underlying region is in the Alps, brighter means more elevation).
  • Figure 5: Outlines for various geographic features produced using coordinates given by GPT-4. Refinement with additional prompts improves the outlines; see (a) after 1 and 6 iterations.
  • ...and 27 more figures