Negative Sampling in Knowledge Graph Representation Learning: A Review

Tiroshan Madushanka, Ryutaro Ichise

TL;DR

This survey systematically reviews negative sampling (NS) methods and their contributions to the success of Knowledge Graph Representation Learning (KGRL). It generalizes and aligns fundamental NS concepts to provide insights for designing effective NS methods for KGRL.

Abstract

Knowledge Graph Representation Learning (KGRL), or Knowledge Graph Embedding (KGE), is essential for AI applications such as knowledge construction and information retrieval. These models encode entities and relations into lower-dimensional vectors, supporting tasks like link prediction and recommendation systems. Training KGE models relies on both positive and negative samples for effective learning, but generating high-quality negative samples from existing knowledge graphs is challenging. The quality of these samples significantly impacts the model's accuracy. This survey systematically reviews negative sampling (NS) methods and their contributions to the success of KGRL. It groups existing NS methods into six categories and outlines the advantages and disadvantages of each. It also identifies open research questions that serve as potential directions for future work. By generalizing and aligning fundamental NS concepts, the survey provides insights for designing effective NS methods for KGRL and aims to motivate further advances in the field.

Paper Structure (46 sections, 14 equations, 9 figures, 2 tables, 1 algorithm)

Figures (9)

  • Figure 1: PRISMA method used for selecting the articles in this survey.
  • Figure 2: The Knowledge Graph Representation Learning Framework aims to train a knowledge graph embedding model by optimizing the scoring function $f(x)$. The objective is to maximize the scores of positive samples $x=(h, r, t)$ while minimizing the scores of negative samples $x'$. This iterative process facilitates the acquisition of meaningful embeddings that capture the semantic relationships between entities and relations in the knowledge graph.
  • Figure 3: Timeline of negative sampling methods in Knowledge Graph Representation Learning. Dotted arrows indicate that the target method extends the source method.
  • Figure 4: Overview of subcategories of static negative sampling methods and steps of training a knowledge graph embedding model over a positive instance $x=(h, r, t)$ and a corrupted negative $x'$ from different static negative sampling methods.
  • Figure 5: Overview of subcategories of dynamic negative sampling methods and steps of training a knowledge graph embedding model over a positive instance $x=(h, r, t)$ and a corrupted negative $x'$ from different dynamic negative sampling methods.
  • ...and 4 more figures
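
The training setup described in the captions of Figures 2 and 4 can be illustrated with a minimal sketch. It assumes uniform random corruption of the head or tail (one simple static strategy), a TransE-style scoring function, and a margin ranking loss. None of these specific choices is stated in the text above, and every name and the toy data below are hypothetical.

```python
import random

def corrupt(triple, entities, known, rng, max_tries=100):
    """Build a negative x' by replacing the head or the tail of x = (h, r, t).

    Assumption: uniform sampling over all entities, rejecting any candidate
    that is already a known positive triple.
    """
    h, r, t = triple
    for _ in range(max_tries):
        e = rng.choice(entities)
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        if candidate not in known:
            return candidate
    raise ValueError("no valid negative found")

def score(triple, ent_vec, rel_vec):
    """Assumed TransE-style score f(x): negative L1 distance between h + r and t."""
    h, r, t = triple
    return -sum(abs(a + b - c) for a, b, c in zip(ent_vec[h], rel_vec[r], ent_vec[t]))

def margin_loss(pos, neg, ent_vec, rel_vec, margin=1.0):
    """Margin ranking loss: zero once f(pos) exceeds f(neg) by at least the margin."""
    return max(0.0, margin - score(pos, ent_vec, rel_vec) + score(neg, ent_vec, rel_vec))

# Hypothetical toy knowledge graph and hand-picked 2-dimensional embeddings.
entities = ["tokyo", "japan", "paris", "france"]
known = {("tokyo", "capital_of", "japan"), ("paris", "capital_of", "france")}
ent_vec = {"tokyo": [0.0, 0.0], "japan": [1.0, 0.0],
           "paris": [0.0, 1.0], "france": [1.0, 1.0]}
rel_vec = {"capital_of": [1.0, 0.0]}

pos = ("tokyo", "capital_of", "japan")
neg = corrupt(pos, entities, known, random.Random(0))
loss = margin_loss(pos, neg, ent_vec, rel_vec)
print(neg, loss)
```

Rejecting candidates found in `known` only filters out triples already present in the graph. A true but unobserved triple can still be drawn as a negative, which is one reason the quality of negative samples matters.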

Theorems & Definitions (3)

  • Definition 1
  • Definition 2
  • Definition 3