DANS-KGC: Diffusion Based Adaptive Negative Sampling for Knowledge Graph Completion

Haoning Li, Qinghua Huang

TL;DR

This paper addresses the limitations of conventional negative sampling in knowledge graph completion by introducing DANS-KGC, a diffusion-based framework conditioned on per-entity learning difficulty. It comprises three components: DAM, which computes entity difficulty; ANS, which generates negatives of varying hardness via a difficulty-aware diffusion process with forward noise scheduling and constraint-guided reverse denoising; and DTM, which implements a curriculum that shifts from easy to hard negatives during training. Experiments on six datasets show state-of-the-art performance on UMLS and YAGO3-10, with strong generalization across diverse KG domains. The approach yields richer negative samples and a dynamic training regimen that together enhance discriminative learning and link-prediction accuracy.

Abstract

Negative sampling (NS) strategies play a crucial role in knowledge graph representation. To overcome the limitations of existing negative sampling strategies, such as vulnerability to false negatives, limited generalization, and lack of control over sample hardness, we propose DANS-KGC (Diffusion-based Adaptive Negative Sampling for Knowledge Graph Completion). DANS-KGC comprises three key components: the Difficulty Assessment Module (DAM), the Adaptive Negative Sampling Module (ANS), and the Dynamic Training Mechanism (DTM). DAM evaluates the learning difficulty of entities by integrating semantic and structural features. Based on this assessment, ANS employs a conditional diffusion model with difficulty-aware noise scheduling, leveraging semantic and neighborhood information during the denoising phase to generate negative samples of diverse hardness. DTM further enhances learning by dynamically adjusting the hardness distribution of negative samples throughout training, enabling a curriculum-style progression from easy to hard examples. Extensive experiments on six benchmark datasets demonstrate the effectiveness and generalization ability of DANS-KGC, which achieves state-of-the-art results on all three evaluation metrics for the UMLS and YAGO3-10 datasets.
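
The curriculum-style progression attributed to DTM can be illustrated with a small sketch. The abstract does not give the paper's actual schedule, so everything below is an assumption made for illustration: the function names `hardness_weights` and `sample_hardness_levels`, the use of three discrete hardness levels, the linear movement of the preferred level with training progress, and the Gaussian-shaped weighting controlled by a hypothetical `sharpness` parameter. It shows only the general idea of shifting the hardness distribution of negatives from easy to hard over training, not the authors' formulation.

```python
import math
import random


def hardness_weights(progress, num_levels=3, sharpness=4.0):
    """Return a probability distribution over hardness levels.

    progress: fraction of training completed, in [0, 1].
    Level 0 is the easiest bucket, level num_levels - 1 the hardest.
    The weighting scheme is an illustrative assumption, not the paper's.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    # The preferred level moves linearly from easiest to hardest.
    target = progress * (num_levels - 1)
    # Gaussian-shaped preference centred on the preferred level.
    raw = [math.exp(-sharpness * (k - target) ** 2) for k in range(num_levels)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_hardness_levels(progress, batch_size, rng, num_levels=3):
    """Draw one hardness level per negative sample in a batch."""
    weights = hardness_weights(progress, num_levels)
    return rng.choices(range(num_levels), weights=weights, k=batch_size)


if __name__ == "__main__":
    rng = random.Random(0)
    total_epochs = 10
    for epoch in (0, 5, 10):
        progress = epoch / total_epochs
        weights = hardness_weights(progress)
        levels = sample_hardness_levels(progress, 8, rng)
        print(epoch, [round(w, 3) for w in weights], levels)
```

In a full pipeline, each sampled level would select which hardness setting the negative generator (ANS in the paper) is asked to produce for that training step. A smooth weighting rather than a hard switch keeps some easy negatives in late epochs, which is one common design choice in curriculum learning.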

Paper Structure

This paper contains 25 sections, 19 equations, 2 figures, 5 tables, and 1 algorithm.

Figures (2)

  • Figure 1: The overall framework of DANS-KGC.
  • Figure 2: The sensitivity analysis results of parameters $\mu$ and $\eta$ on the Family dataset.