Optimal Embedding Guided Negative Sample Generation for Knowledge Graph Link Prediction

Makoto Takamoto, Daniel Oñoro-Rubio, Wiem Ben Rim, Takashi Maruyama, Bhushan Kotnis

TL;DR

The paper tackles the challenge of training knowledge graph embeddings for link prediction by focusing on high-quality negative sampling. It introduces Embedding MUtation (EMU), a simple yet principled method that generates negative tails by mutating embedding components toward the positive tail, guided by a theoretical condition for near-optimal embedding. The authors show, both theoretically and empirically, that EMU yields an approximately isotropic negative distribution around positives and delivers consistent performance gains across multiple KGE models and datasets, often matching the performance of much larger embedding dimensions. EMU is shown to be compatible with existing sampling strategies and scalable to state-of-the-art models like NBFNet, providing practical gains with reduced computational requirements. The work offers a solid combination of theory and experiments, demonstrating EMU’s potential to improve KG link prediction broadly and suggesting extensions to other graph representation tasks.
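The mutation idea described above can be illustrated with a minimal sketch. It assumes that "mutating embedding components toward the positive tail" means copying a randomly chosen subset of the positive tail's components into a sampled negative tail. The function name `mutate_negative` and the parameter `n_mutate` are hypothetical and are not taken from the paper or its repository, which may differ in how the mutated components are chosen and how the resulting negatives enter the loss.

```python
import random


def mutate_negative(pos_tail, neg_tail, n_mutate, rng=None):
    """Return a copy of neg_tail in which n_mutate randomly chosen
    components are replaced by the matching components of pos_tail.

    Illustrative assumption only: this is one plausible reading of
    "embedding mutation", not the authors' reference implementation.
    """
    dim = len(pos_tail)
    if len(neg_tail) != dim:
        raise ValueError("embeddings must have the same dimension")
    if not 0 <= n_mutate <= dim:
        raise ValueError("n_mutate must lie between 0 and the dimension")
    rng = rng or random.Random()
    # Indices of the components copied from the positive tail.
    mutated = set(rng.sample(range(dim), n_mutate))
    return [p if i in mutated else n
            for i, (p, n) in enumerate(zip(pos_tail, neg_tail))]


if __name__ == "__main__":
    positive = [1.0, 2.0, 3.0, 4.0]
    negative = [-1.0, -2.0, -3.0, -4.0]
    # Two of the four components now agree with the positive tail.
    print(mutate_negative(positive, negative, 2, random.Random(0)))
```

In this sketch, a larger `n_mutate` makes the generated negative share more components with the positive tail, and therefore makes it harder to tell apart from the positive.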

Abstract

Knowledge graph embedding (KGE) models encode the structural information of knowledge graphs to predict new links. Effective training of these models requires distinguishing between positive and negative samples with high precision. Although prior research has shown that improving the quality of negative samples can significantly enhance model accuracy, identifying high-quality negative samples remains a challenging problem. This paper theoretically investigates the condition under which negative samples lead to optimal KG embedding and identifies a sufficient condition for an effective negative sample distribution. Based on this theoretical foundation, we propose Embedding MUtation (EMU), a novel framework that generates negative samples satisfying this condition, in contrast to conventional methods that focus on identifying challenging negative samples within the training data. Importantly, the simplicity of EMU ensures seamless integration with existing KGE models and negative sampling methods. To evaluate its efficacy, we conducted comprehensive experiments across multiple datasets. The results consistently demonstrate significant improvements in link prediction performance across various KGE models and negative sampling methods. Notably, EMU enables performance improvements comparable to those achieved by models with an embedding dimension five times larger. An implementation of the method and the experiments is available at https://github.com/nec-research/EMU-KG.

Paper Structure

This paper contains 43 sections, 6 theorems, 22 equations, 6 figures, 15 tables.

Key Result

Theorem 3.1

Assuming the DistMult model, an empirical realization of the covariance measure can be given as, where $\mathbf{v}^{hr} = \mathbf{z}^h \odot \mathbf{z}^r$ and $\mathbf{z}^h, \mathbf{z}^r, \mathbf{z}^t$ are the head, relation, and tail embedding vectors, respectively.
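For context on the notation, the standard DistMult score of a triple is the trilinear product $\sum_i z^h_i z^r_i z^t_i$, which equals the inner product of $\mathbf{v}^{hr}$ with $\mathbf{z}^t$. The sketch below shows only that scoring function. It does not reproduce the covariance expression of the theorem, and the function names are hypothetical.

```python
def hadamard(a, b):
    """Element-wise product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return [x * y for x, y in zip(a, b)]


def distmult_score(z_h, z_r, z_t):
    """DistMult score written as <v_hr, z_t> with v_hr = z_h * z_r."""
    v_hr = hadamard(z_h, z_r)
    if len(v_hr) != len(z_t):
        raise ValueError("vectors must have the same dimension")
    return sum(v * t for v, t in zip(v_hr, z_t))


if __name__ == "__main__":
    # 1*3*5 + 2*4*6 = 63
    print(distmult_score([1, 2], [3, 4], [5, 6]))
```

Because $\mathbf{v}^{hr}$ depends only on the head and relation, it can be computed once per query and reused to score every candidate tail.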

Figures (6)

  • Figure 1: EMU generates new negative samples through embedding mutation. The figure illustrates a typical example that generates hard negative tails.
  • Figure 2: MRR for the datasets FB15k-237, YAGO3-10, and WN18RR. The blue, orange, green, and red bars show the results for the following negative sampling methods: "SAN", "SAN with EMU", "uniform", and "uniform with EMU", respectively.
  • Figure 3: Cosine similarity between positive and negative sample pairs for DistMult trained on the FB15k-237 dataset. The negative samples used are uniform, EMU, and SAN. Larger values indicate greater similarity.
  • Figure 4: Results of the analysis of EMU for the DistMult model trained on the FB15k-237 (Left) and WN18RR (Right) datasets. Left: the distribution of real tails and uniformly sampled negative tails along the 1st and 2nd PCA components. Right: the distribution of real tails and EMU negative tails along the 1st and 2nd PCA components.
  • Figure 5: Dependence of MRR on the number of negative samples for models trained on FB15k-237. For ComplEx and DistMult with uniform negative sampling, the right-most point is the "1 VS ALL" result.
  • ...and 1 more figures

Theorems & Definitions (12)

  • Theorem 3.1
  • proof
  • Theorem 3.2
  • Theorem 3.3
  • proof
  • Lemma 3.4
  • Theorem 3.5
  • proof
  • Proposition B.1
  • proof
  • ...and 2 more