Do Sentence Transformers Learn Quasi-Geospatial Concepts from General Text?
Ilya Ilyankou, Aldo Lipani, Stefano Cavazzi, Xiaowei Gao, James Haworth
TL;DR
The paper asks whether sentence transformers fine-tuned on general question-answering data capture quasi-geospatial concepts when applied, without further training, to hiking-route descriptions. It builds a large, template-generated text corpus from OS Maps routes and uses cosine similarity between embeddings to match 20 hiking-related queries to route descriptions. Results show limited but detectable zero-shot signals (e.g., for beginner-level routes and for coastal versus urban routes), with substantial variation across model architectures, indicating that pre-training and architecture shape geospatial understanding. The work highlights the potential of geospatial text understanding for routing tasks and calls for more systematic, architecture-aware evaluation and richer textual representations of geospatial objects.
Abstract
Sentence transformers are language models designed to perform semantic search. This study investigates the capacity of sentence transformers, fine-tuned on general question-answering datasets for asymmetric semantic search, to associate descriptions of human-generated routes across Great Britain with queries often used to describe hiking experiences. We find that sentence transformers have some zero-shot capabilities to understand quasi-geospatial concepts, such as route types and difficulty, suggesting their potential utility for routing recommendation systems.
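Asymmetric semantic search of this kind is commonly implemented by embedding a short query and each longer passage, then ranking passages by cosine similarity to the query. The sketch below illustrates only that ranking step and is not the authors' pipeline. The three-dimensional vectors, the route names, and the helper `rank_routes` are hypothetical stand-ins; in practice the vectors would come from a sentence transformer model.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_routes(
    query_vec: Sequence[float],
    route_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (route name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in route_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model outputs.
toy_routes = {
    "coastal_walk": [0.9, 0.1, 0.0],
    "urban_loop": [0.1, 0.9, 0.1],
    "mountain_ridge": [0.3, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.1]  # stand-in for an embedded query

ranking = rank_routes(toy_query, toy_routes)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With real embeddings, the same ranking could be repeated for each query, and the top-ranked descriptions inspected to see whether they reflect the concept the query names.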
