GeoResponder: Towards Building Geospatial LLMs for Time-Critical Disaster Response

Ahmed El Fekih Zguir, Ferda Ofli, Muhammad Imran

TL;DR

GeoResponder tackles the lack of geospatial reasoning in large language models by introducing a scaffolded, three-layer curriculum that grounds language in coordinates, enforces geometric/topological inference, and enables constraint-aware retrieval. The approach relies on OpenStreetMap-derived representations with atomic operations across grounding, reasoning, and retrieval to produce robust, multi-hop spatial solutions for disaster response. Across four diverse cities, GeoResponder consistently outperforms strong baselines and demonstrates strong generalization, including to out-of-distribution spatial queries. The work highlights the potential of internalized, structured geospatial priors as a low-latency complement or alternative to brittle external-tool workflows in time-critical scenarios.

Abstract

Large Language Models excel at linguistic tasks but lack the inner geospatial capabilities needed for time-critical disaster response, where reasoning about road networks, continuous coordinates, and access to essential infrastructure such as hospitals, shelters, and pharmacies is vital. We introduce GeoResponder, a framework that instills robust spatial reasoning through a scaffolded instruction-tuning curriculum. By stratifying geospatial learning into different cognitive layers, we effectively anchor semantic knowledge to the continuous coordinate manifold and enforce the internalization of spatial axioms. Extensive evaluations across four topologically distinct cities and diverse tasks demonstrate that GeoResponder significantly outperforms both state-of-the-art foundation models and domain-specific baselines. These results suggest that LLMs can begin to internalize and generalize geospatial structures, pointing toward the future development of language models capable of supporting disaster response needs.

Paper Structure

This paper contains 23 sections, 10 figures, 5 tables, and 1 algorithm.

Figures (10)

  • Figure 1: Qualitative comparison of SOTA models with GeoResponder.
  • Figure 2: Visual Overview of the First 2 levels of the Cognitive Layers.
  • Figure 3: Out-of-distribution disaster tasks: stacked comparison between CityGPT performance (grey) and the absolute improvement achieved by GeoResponder–Mistral (yellow).
  • Figure 4: In-distribution non-MCQ bar distribution.
  • Figure 5: Ablation study showing the average performance drop when individual representations are removed. Bars indicate the reduction in the unified metric (average of all downstream tasks).
  • ...and 5 more figures