GeoRDF2Vec Learning Location-Aware Entity Representations in Knowledge Graphs

Martin Boeckling, Heiko Paulheim, Sarah Detzler

TL;DR

The paper tackles the limitation that many knowledge graphs (KGs) encode geographic information only as literals or sparse spatial relations, which hinders location-aware reasoning. It proposes GeoRDF2Vec, a two-stage method that (i) floods geographic geometries to non-geographic nodes and (ii) biases RDF2Vec walks toward spatially close nodes. Proximity is measured by the geodesic distance $d_{ij}=R\cdot \Delta \sigma$, where $R$ is the Earth's radius and $\Delta \sigma$ is the central angle between nodes $i$ and $j$, and is turned into an edge weight by the exponential function $w_{ij}=\exp(-d_{ij})$. Empirically, GeoRDF2Vec outperforms vanilla RDF2Vec and GeoTransE on dmg777k and various DBpedia GEval tasks, indicating that location-aware embeddings improve downstream predictions while maintaining similar computational complexity. The work includes a flooding analysis and hyperparameter studies, demonstrates applicability to partially geographic KGs, and provides a GitHub resource for reproducibility.
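The two formulas above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the haversine formula used for $\Delta \sigma$, the choice of kilometres as the unit, the normalisation of weights into walk probabilities, the uniform fallback, and all function names are assumptions made for illustration.

```python
import math

# Assumption: mean Earth radius in kilometres; the summary does not state the unit.
EARTH_RADIUS_KM = 6371.0088


def central_angle(lat1, lon1, lat2, lon2):
    """Central angle (radians) between two lat/lon points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def geodesic_distance(p, q, radius=EARTH_RADIUS_KM):
    """d_ij = R * delta_sigma for two (lat, lon) tuples."""
    return radius * central_angle(p[0], p[1], q[0], q[1])


def spatial_weight(p, q, radius=EARTH_RADIUS_KM):
    """w_ij = exp(-d_ij)."""
    return math.exp(-geodesic_distance(p, q, radius))


def transition_probabilities(current, neighbours, radius=EARTH_RADIUS_KM):
    """Turn spatial weights into walk probabilities (hypothetical normalisation).

    `neighbours` maps a neighbour id to its (lat, lon) tuple.
    """
    weights = {n: spatial_weight(current, coord, radius) for n, coord in neighbours.items()}
    if not weights:
        return {}
    total = sum(weights.values())
    if total == 0.0:
        # All weights underflowed to zero; fall back to a uniform choice (assumption).
        return {n: 1.0 / len(weights) for n in weights}
    return {n: w / total for n, w in weights.items()}


# Illustrative coordinates only; "near" is a few km away, "far" roughly 18 km.
current = (49.4875, 8.4660)
neighbours = {"near": (49.4774, 8.4452), "far": (49.3988, 8.6724)}
probabilities = transition_probabilities(current, neighbours)
```

Because $\exp(-d_{ij})$ decays very quickly, the unit of $d_{ij}$ strongly affects how sharply the walk prefers nearby nodes; with kilometres, neighbours more than a few kilometres away receive almost no probability mass.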

Abstract

Many knowledge graphs contain a substantial number of spatial entities, such as cities, buildings, and natural landmarks. For many of these entities, exact geometries are stored within the knowledge graphs. However, most existing approaches for learning entity representations do not take these geometries into account. In this paper, we introduce a variant of RDF2Vec that incorporates geometric information to learn location-aware embeddings of entities. Our approach expands different nodes by flooding the graph from geographic nodes, ensuring that each reachable node is considered. Based on the resulting flooded graph, we apply a modified version of RDF2Vec that biases graph walks using spatial weights. Through evaluations on multiple benchmark datasets, we demonstrate that our approach outperforms both non-location-aware RDF2Vec and GeoTransE.
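The flooding step can be pictured as a multi-source breadth-first search that starts from every node with a known geometry. The sketch below is only one possible reading of the abstract: the rule that a non-geographic node inherits the geometry of the geographic node closest in hops, the undirected adjacency dictionary, and the name `flood_geometries` are assumptions, not details confirmed by the text.

```python
from collections import deque


def flood_geometries(adjacency, geometries):
    """Propagate geometries from geographic nodes to every reachable node.

    adjacency:  dict mapping a node to an iterable of neighbouring nodes
                (assumed undirected for this sketch).
    geometries: dict mapping each geographic node to its geometry,
                here a (lat, lon) tuple.
    Returns a dict with a geometry for every node reachable from a geographic node.
    """
    assigned = dict(geometries)   # seed nodes keep their own geometry
    queue = deque(geometries)     # multi-source BFS frontier
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in assigned:
                # Assumption: inherit the geometry of the first geographic
                # source whose flood reaches this node.
                assigned[neighbour] = assigned[node]
                queue.append(neighbour)
    return assigned


# Hypothetical toy graph: one city with coordinates, two entities without.
adjacency = {
    "city": ["company"],
    "company": ["city", "founder"],
    "founder": ["company"],
    "isolated": [],
}
flooded = flood_geometries(adjacency, {"city": (49.4875, 8.4660)})
```

Nodes that cannot be reached from any geographic node, such as `isolated` here, receive no geometry.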

Paper Structure

This paper contains 16 sections, 8 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Example graph—excerpt from DBpedia
  • Figure 2: Different calculation of distances and resulting spatial weights for edges using the generated Erdős-Rényi graph
  • Figure 3: Influence of walk distance and number of walks of RDF2Vec variants