Domain and Range Aware Synthetic Negatives Generation for Knowledge Graph Embedding Models

Alberto Bernardi, Luca Costabello

TL;DR

We revamp a strategy that generates corruptions during training while respecting the domain and range of relations, extend its capabilities, and show that the resulting methods bring substantial improvement on standard benchmark datasets.

Abstract

Knowledge Graph Embedding models, which represent entities and edges in a low-dimensional space, have been extremely successful at solving tasks related to completing and exploring Knowledge Graphs (KGs). A key aspect of training most of these models is teaching them to discriminate between true statements (positives) and false ones (negatives). However, defining negatives is not trivial: facts missing from the KG are not necessarily false, and a set of ground-truth negatives is hardly ever given. This makes synthetic negative generation a necessity. The choice of generation strategy can heavily affect the quality of the embeddings, making it a primary aspect to consider. We revamp a strategy that generates corruptions during training while respecting the domain and range of relations, extend its capabilities, and show that our methods bring substantial improvement (+10% MRR) on standard benchmark datasets and over +150% MRR on a larger ontology-backed dataset.
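To make the idea of domain- and range-aware corruption concrete, the following is a minimal sketch, not the authors' implementation. It assumes that the domain and range of a relation can be approximated by the subjects and objects observed with that relation in the training triples (an ontology could supply them instead). The names `build_domain_range`, `corrupt`, and the mixing parameter `nu` are hypothetical; `nu` is loosely modelled on the proportion of ontology-based negatives $\nu$ mentioned in Figure 1.

```python
import random
from collections import defaultdict


def build_domain_range(triples):
    """Approximate each relation's domain/range from observed triples (assumption)."""
    domain, range_ = defaultdict(set), defaultdict(set)
    for s, p, o in triples:
        domain[p].add(s)
        range_[p].add(o)
    # Sorted lists give a stable order for reproducible sampling.
    return ({p: sorted(v) for p, v in domain.items()},
            {p: sorted(v) for p, v in range_.items()})


def corrupt(triple, entities, domain, range_, known, nu=0.5, rng=random, max_tries=100):
    """Return one synthetic negative for `triple`, or None if none was found.

    With probability `nu` the replacement entity is drawn from the relation's
    domain (subject side) or range (object side); otherwise it is drawn
    uniformly from all entities. Candidates already in `known` are rejected.
    """
    s, p, o = triple
    for _ in range(max_tries):
        corrupt_subject = rng.random() < 0.5
        if rng.random() < nu:
            pool = domain[p] if corrupt_subject else range_[p]
        else:
            pool = entities
        candidate = rng.choice(pool)
        negative = (candidate, p, o) if corrupt_subject else (s, p, candidate)
        if negative not in known:
            return negative
    return None
```

With `nu=1.0` every corruption is type-consistent with the relation; with `nu=0.0` the sketch falls back to uniform random corruption. Rejecting candidates found in `known` only filters triples present in the KG, so a returned negative may still be a true but missing fact.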

Paper Structure

This paper contains 21 sections, 1 figure, 5 tables, 1 algorithm.

Figures (1)

  • Figure 1: Performance of the different models varying the proportion of ontology-based negatives $\nu$.