Unified Interpretation of Smoothing Methods for Negative Sampling Loss Functions in Knowledge Graph Embedding

Xincan Feng, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

TL;DR

This work addresses the sparsity-induced challenges in training knowledge graph embeddings by offering a unified interpretation of smoothing methods for negative sampling loss. It introduces Triplet Adaptive Negative Sampling (TANS), which models the joint probability of query-triplet pairs and integrates model-based and count-based smoothing, effectively subsuming existing approaches like SANS and subsampling. The authors provide a theoretical framework and demonstrate empirically that TANS (especially when combined with subsampling) improves link prediction performance across multiple KGE models and standard datasets, with notable gains on sparser graphs. The findings offer a principled pathway to select or combine smoothing strategies in KG embedding, with practical implications for scalable, robust KGE training.

Abstract

Knowledge Graphs (KGs) are fundamental resources for knowledge-intensive tasks in NLP. Because manual construction of KGs has its limits, KG Completion (KGC) plays an important role in completing KGs automatically by scoring their links with KG Embedding (KGE). To handle the large number of entities during training, KGE relies on the Negative Sampling (NS) loss, which reduces computational cost by sampling. Since each link appears at most once in a KG, sparsity is an essential and inevitable problem, and the NS loss is no exception. As a remedy, the NS loss in KGE relies on smoothing methods such as Self-Adversarial Negative Sampling (SANS) and subsampling. However, the lack of theoretical understanding leaves it unclear which smoothing method suits this purpose. This paper provides theoretical interpretations of the smoothing methods for the NS loss in KGE and derives a new NS loss, Triplet Adaptive Negative Sampling (TANS), that covers the characteristics of the conventional smoothing methods. Experimental results with TransE, DistMult, ComplEx, RotatE, HAKE, and HousE on the FB15k-237, WN18RR, and YAGO3-10 datasets and their sparser subsets show the soundness of our interpretation and the performance improvement achieved by TANS.
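
To make the abstract's terminology concrete, the following is a minimal sketch of an NS loss for a single triplet, with and without self-adversarial (SANS) weighting of the negatives. It follows the commonly used SANS formulation, in which each negative is weighted by a softmax over the negatives' scores; it is not taken from this page and it is not TANS. The function names, the `alpha` temperature parameter, and the convention that a higher score means a more plausible triplet are assumptions made for illustration.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def softmax(values, temperature=1.0):
    """Softmax of temperature * values, shifted by the max for stability."""
    scaled = [temperature * v for v in values]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ns_loss(pos_score, neg_scores, alpha=None):
    """NS loss for one positive triplet and its sampled negatives.

    Assumption: scores already include any margin, and higher = more plausible.
    alpha=None  -> uniform weights over negatives (plain NS).
    alpha=float -> SANS-style weights, softmax(alpha * neg_scores).
    In training, these weights are usually treated as constants (no gradient).
    """
    if alpha is None:
        weights = [1.0 / len(neg_scores)] * len(neg_scores)
    else:
        weights = softmax(neg_scores, alpha)
    pos_term = -log_sigmoid(pos_score)
    neg_term = -sum(w * log_sigmoid(-s) for w, s in zip(weights, neg_scores))
    return pos_term + neg_term


# Hypothetical scores for one query and three sampled negatives.
uniform = ns_loss(2.0, [1.0, -1.0, 0.5])
adversarial = ns_loss(2.0, [1.0, -1.0, 0.5], alpha=1.0)
```

With `alpha=0.0` the softmax weights become uniform and the two variants coincide; a larger `alpha` shifts weight toward the negatives the model currently scores highest.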

Paper Structure

This paper contains 37 sections, 13 equations, 8 figures, and 10 tables.

Figures (8)

  • Figure 1: Appearance frequencies of queries and answers (entities) in the training data of FB15k-237, WN18RR, and YAGO3-10. Note that the indices are sorted from high frequency to low.
  • Figure 2: Performances of KGE models HousE, HAKE, RotatE, ComplEx, DistMult, and TransE on datasets FB15k-237, WN18RR, and YAGO3-10 using NS, SANS, and subsampling methods (noted as Base, Freq, Uniq).
  • Figure 3: KGC performance on common KGs (Notations are the same as in Figure 2).
  • Figure 4: KGC performance on filtered sparser KGs, i.e., FB15k-237-HL, WN18RR-HL, and YAGO3-10-HL (Notations are the same as in Figure 2).
  • Figure 5: Appearance frequencies of queries and answers (entities) in the training data of the sparser subsets FB15k-237-HL, WN18RR-HL, and YAGO3-10-HL. Note that the indices are sorted from high frequency to low.
  • ...and 3 more figures