Numerical Literals in Link Prediction: A Critical Examination of Models and Datasets

Moritz Blum, Basil Ell, Hannes Ill, Philipp Cimiano

TL;DR

This paper critically examines how numerical literals are used in link prediction (LP) over knowledge graphs, showing that many literal-aware models do not effectively exploit numerical information on standard benchmarks. It introduces a semi-synthetic dataset and ablation strategies to isolate the impact of literals from that of graph structure. The study finds that several models rely on extra parameters rather than literal information, while a specialized model family (KGA variants) can leverage literals in synthetic settings. The results emphasize the need for more thorough evaluation and for harder datasets that truly assess the value of numerical literals in LP.

Abstract

Link Prediction (LP) is an essential task over Knowledge Graphs (KGs), traditionally focused on using and predicting the relations between entities. Textual entity descriptions have already been shown to be valuable, but models that incorporate numerical literals have shown only minor improvements on existing benchmark datasets. It is unclear whether a model is actually better at using numerical literals or simply better at utilizing the graph structure. This raises doubts about the effectiveness of these methods and about the suitability of the existing benchmark datasets. We propose a methodology to evaluate LP models that incorporate numerical literals. We propose i) a new synthetic dataset to better understand how well these models use numerical literals and ii) dataset ablation strategies to investigate potential difficulties with the existing datasets. We identify a prevalent trend: many models underutilize literal information and potentially rely on additional parameters for performance gains. Our investigation highlights the need for more extensive evaluations when releasing new models and datasets.

Paper Structure (33 sections, 4 equations, 5 figures, 6 tables)

Figures (5)

  • Figure 1: Example KG about the Eiffel Tower. The KG contains the entities Eiffel Tower, Tourist Attraction, and Observation Tower, and the literal value 324 meters, the height of the Eiffel Tower.
  • Figure 2: Example of the synthetic dataset enrichment. The entities High-rise and Low-rise represent $c_{high}$ and $c_{low}$, and the relation "is a" serves as $r_{syn-a}$. Ideally, an LP model predicts the tail entity High-rise for the given head Berliner Fernsehturm and the relation "is a".
  • Figure 3: MRR scores over three runs for models and datasets that either include the original literal features or that include random literal features.
  • Figure 4: KGA$_{TuckER}$'s MRR scores (mean and variance over three runs) after removing x% of the relational triples from FB15k-237. The model is provided either with the original or with random numerical features. The variance is too small to be visible in the figure.
  • Figure 5: MRR score after removing some percentage of relational triples from FB15k-237. The models are provided either with the original numerical features or with random features. Mean score and variance are shown across three runs.
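
The captions above refer to three dataset manipulations: enriching a KG with synthetic class triples derived from a literal (Figure 2), replacing the original literal features with random ones (Figures 3 to 5), and removing a percentage of relational triples (Figures 4 and 5). The following Python snippet is a minimal sketch of what such manipulations could look like, not the authors' implementation. The function names, the median threshold, the uniform sampling range, and all entities other than the Eiffel Tower and the Berliner Fernsehturm are assumptions made for illustration.

```python
import random
from statistics import median


def add_synthetic_class_triples(literals, relation="is a",
                                c_high="High-rise", c_low="Low-rise"):
    """Derive one (head, relation, class) triple per entity from its literal.

    Assumption: entities whose value exceeds the median are assigned c_high,
    all others c_low. The threshold used in the paper may differ.
    """
    threshold = median(literals.values())
    return [
        (entity, relation, c_high if value > threshold else c_low)
        for entity, value in literals.items()
    ]


def randomize_literals(literals, seed=0):
    """Replace every literal with a uniform draw from the observed value range.

    Assumption: uniform sampling; other random-feature schemes are possible.
    """
    rng = random.Random(seed)
    low, high = min(literals.values()), max(literals.values())
    return {entity: rng.uniform(low, high) for entity in literals}


def remove_relational_triples(triples, fraction, seed=0):
    """Return a random subset with `fraction` of the relational triples removed."""
    rng = random.Random(seed)
    n_keep = len(triples) - round(len(triples) * fraction)
    return rng.sample(triples, n_keep)


# Toy data: heights in meters. The last two entities are hypothetical.
heights = {
    "Eiffel Tower": 324.0,
    "Berliner Fernsehturm": 368.0,
    "Small House": 8.0,
    "Garden Shed": 3.0,
}

synthetic_triples = add_synthetic_class_triples(heights)
random_heights = randomize_literals(heights, seed=1)
reduced_triples = remove_relational_triples(synthetic_triples, 0.5, seed=1)
```

If a model scores about the same with `random_heights` as with `heights`, that suggests it is not drawing on the literal values. Removing relational triples weakens the structural signal, so any benefit from the literals should become easier to observe.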