MapQA: Open-domain Geospatial Question Answering on Map Data

Zekun Li, Malcolm Grossman, Eric Qasemi, Mihir Kulkarni, Muhao Chen, Yao-Yi Chiang

TL;DR

MapQA tackles geospatial question answering by requiring grounding in map geometries and spatial relations. It introduces a geometry-aware benchmark derived from OpenStreetMap across Southern California and Illinois, totaling 3,154 QA pairs with nine reasoning templates. The study benchmarks retrieval-based DPR (with BERT/GeoLM) and LLM-based text-to-SQL (GPT, Gemini, etc.), finding that retrieval methods capture proximity and direction but struggle with precise distances, while LLMs excel at one-hop SQL but have difficulty with multi-hop spatial reasoning. The work provides a scalable data-generation approach and a baseline suite, paving the way for more robust geospatial QA systems and GIS-enabled applications; future work will extend topological relations and regional coverage.

Abstract

Geospatial question answering (QA) is a fundamental task in navigation and point-of-interest (POI) search. Although geospatial QA datasets exist, they are limited in both scale and diversity, often relying solely on textual descriptions of geo-entities without considering their geometries. A major challenge in scaling geospatial QA datasets for reasoning lies in the complexity of geospatial relationships, which require integrating spatial structures, topological dependencies, and multi-hop reasoning capabilities that most text-based QA datasets lack. To address these limitations, we introduce MapQA, a novel dataset that not only provides question-answer pairs but also includes the geometries of geo-entities referenced in the questions. MapQA is constructed using SQL query templates to extract question-answer pairs from OpenStreetMap (OSM) for two study regions: Southern California and Illinois. It consists of 3,154 QA pairs spanning nine question types that require geospatial reasoning, such as neighborhood inference and geo-entity type identification. Compared to existing datasets, MapQA expands both the number and diversity of geospatial question types. We explore two approaches to tackle this challenge: (1) a retrieval-based language model that ranks candidate geo-entities by embedding similarity, and (2) a large language model (LLM) that generates SQL queries from natural language questions and geo-entity attributes, which are then executed against an OSM database. Our findings indicate that retrieval-based methods effectively capture concepts like closeness and direction but struggle with questions that require explicit computations (e.g., distance calculations). LLMs (e.g., GPT and Gemini) excel at generating SQL queries for one-hop reasoning but face challenges with multi-hop reasoning, highlighting a key bottleneck in advancing geospatial QA systems.
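To make the retrieval-based approach concrete, the following is a minimal sketch of ranking candidate geo-entities by embedding similarity. It is not the authors' implementation: the candidate names, the three-dimensional vectors, and the helper names `cosine` and `rank_candidates` are hypothetical, and a real system would obtain the embeddings from a trained encoder rather than hard-coding them.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(question_vec, candidates):
    """Return candidate names sorted from most to least similar to the question."""
    return sorted(
        candidates,
        key=lambda name: cosine(question_vec, candidates[name]),
        reverse=True,
    )

# Hypothetical toy embeddings; an encoder would normally produce these.
candidates = {
    "cafe_a": [0.9, 0.1, 0.0],
    "school_b": [0.1, 0.8, 0.3],
    "park_c": [0.0, 0.2, 0.9],
}
question_vec = [0.8, 0.2, 0.1]

ranking = rank_candidates(question_vec, candidates)
print(ranking)
```

Because the ranking depends only on vector similarity, nothing in this procedure computes a distance in meters, which is consistent with the reported difficulty on questions that need explicit distance calculations.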

Paper Structure

This paper contains 16 sections, 4 figures, 7 tables.

Figures (4)

  • Figure 1: The figure illustrates the two primary approaches we employed to tackle geospatial question answering (QA) problems. In the retrieval-based approach (e.g., Dense Passage Retrieval; Karpukhin et al., 2020), we construct the candidates by gathering all the entities in the geospatial database. QA is conducted by evaluating the similarity between the embeddings of the question and the candidate entities. In the large language model (LLM) text-to-SQL approach, the LLM generates SQL queries based on the given question and its context, where the context specifies attributes (e.g., OSM ID) of the query geo-entity. The QA process is conducted by running these SQL queries against the OpenStreetMap (OSM) database.
  • Figure 2: The figure illustrates the distribution of geo-entities in the questions for two study regions. Orange dots represent geo-entities from the training set, while blue dots denote those in the test set. Since Illinois serves as a zero-shot test set, it contains no training geo-entities (i.e., no orange dots). The SouthCal split includes only geo-entities from Southern California, rather than the entire state, to optimize computational efficiency. However, the same methodology is directly applicable to Northern California or other regions. The geo-entity distribution closely follows county-level population patterns derived from US Census data, indicating that the sampling strategy effectively captures real-world user interests and reflects typical spatial distributions in practical applications.
  • Figure 3: The figure shows the amenity type distribution in the answers for the SouthCal and Illinois study regions, with SouthCal featuring 73 distinct amenity types and Illinois 29. During the training of retrieval-based models, a total of 175 amenity types were used as candidate options. Due to space constraints, the pie chart does not include all amenity types in its labels. For a comprehensive list of the amenity types referenced in the answers, please refer to the amenity type appendix.
  • Figure 4: The percentage of malformed SQL scripts (e.g., those containing syntax errors) generated by each LLM in the Text-to-SQL experiments.
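
The text-to-SQL loop in Figure 1 and the malformed-script percentage in Figure 4 can be illustrated with a small sketch. It rests on several assumptions: an in-memory SQLite database stands in for the OSM database, the `pois` table and its columns are hypothetical, the "generated" queries are hand-written stand-ins for LLM output, and any query that fails to execute is counted as malformed.

```python
import sqlite3

def build_demo_db():
    """Create a tiny stand-in database (hypothetical schema, not the OSM schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pois (osm_id INTEGER PRIMARY KEY, name TEXT, amenity TEXT)"
    )
    conn.executemany(
        "INSERT INTO pois VALUES (?, ?, ?)",
        [(1, "Cafe A", "cafe"), (2, "School B", "school")],
    )
    return conn

def run_generated_sql(conn, sql):
    """Execute one generated query; return its rows, or None if it fails."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None

def malformed_percentage(conn, queries):
    """Percentage of queries that could not be executed."""
    failures = sum(1 for q in queries if run_generated_sql(conn, q) is None)
    return 100.0 * failures / len(queries)

conn = build_demo_db()

# Hand-written stand-ins for LLM output; the second contains a syntax error.
generated = [
    "SELECT amenity FROM pois WHERE osm_id = 1",
    "SELEC name FROM pois",
    "SELECT name FROM pois WHERE amenity = 'school'",
]

answers = run_generated_sql(conn, generated[0])
rate = malformed_percentage(conn, generated)
print(answers, round(rate, 1))
```

Treating every execution failure as malformed is a simplification: it also counts queries that are syntactically valid but reference a missing table or column, which may differ from how the figure's percentages were computed.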