MuCo-KGC: Multi-Context-Aware Knowledge Graph Completion

Haji Gul, Ajaz Ahmad Bhat, Abdul Ghani Haji Naim

TL;DR

MuCo-KGC tackles tail entity prediction in incomplete knowledge graphs by drawing three contextual signals from the graph itself: the relations attached to the head entity, the head's neighboring entities (together forming the head context $H_c$), and a global relation context $R_c$. These are fed, alongside the head $h$ and relation $r$, into a BERT-based classifier that predicts the tail $t$. The model removes the dependence on entity descriptions and negative sampling. Context computation is linear, $O(|T|)$, giving a training cost of $O(3|T|) + O(|T|\cdot|BERT|)$ and an inference cost of $O(|T|\cdot|BERT|)$. Empirically, it achieves state-of-the-art results on WN18RR (MRR $=0.685$, Hits@1 $=0.637$) and CoDEx-M (MRR $=0.470$), with strong performance on CoDEx-S and competitive results on FB15k-237, while using substantially fewer parameters than methods like SimKGC and DIFT. The findings underscore the value of structural KG context for robust tail prediction and offer a scalable alternative to description-dependent or sampling-heavy KGC approaches. Future work aims to incorporate more multi-hop and long-range contextual information.
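The MRR and Hits@1 figures quoted above are standard ranking metrics for tail prediction. As a reminder of what they measure, here is a minimal sketch using their usual definitions; the function names and the example ranks are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Average of 1/rank over all test queries (ranks are 1-based)."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose correct tail is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct tail entity for four test queries.
example_ranks = [1, 2, 1, 4]
mrr = mean_reciprocal_rank(example_ranks)
h1 = hits_at_k(example_ranks, 1)
print(f"MRR={mrr:.4f}  Hits@1={h1:.2f}")
```

Both metrics lie in $[0, 1]$ and higher is better; an MRR of $0.685$ therefore means the correct tail sits, on a reciprocal-rank average, between rank 1 and rank 2.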

Abstract

Knowledge graph completion (KGC) seeks to predict missing entities (e.g., heads or tails) or relationships in knowledge graphs (KGs), which often contain incomplete data. Traditional embedding-based methods, such as TransE and ComplEx, have improved tail entity prediction but struggle to generalize to unseen entities during testing. Text-based models mitigate this issue by leveraging additional semantic context; however, their reliance on negative triplet sampling introduces high computational overhead, semantic inconsistencies, and data imbalance. Recent approaches, like KG-BERT, show promise but depend heavily on entity descriptions, which are often unavailable in KGs. Critically, existing methods overlook valuable structural information in the KG related to the entities and relationships. To address these challenges, we propose Multi-Context-Aware Knowledge Graph Completion (MuCo-KGC), a novel model that utilizes contextual information from linked entities and relations within the graph to predict tail entities. MuCo-KGC eliminates the need for entity descriptions and negative triplet sampling, significantly reducing computational complexity while enhancing performance. Our experiments on standard datasets, including FB15k-237, WN18RR, CoDEx-S, and CoDEx-M, demonstrate that MuCo-KGC outperforms state-of-the-art methods on three datasets. Notably, MuCo-KGC improves MRR on the WN18RR, CoDEx-S, and CoDEx-M datasets by $1.63\%$, $3.77\%$, and $20.15\%$, respectively, demonstrating its effectiveness for KGC tasks.

Paper Structure

This paper contains 7 sections, 10 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: A concise overview of the MuCo-KGC model pipeline for predicting the tail entity, given a head entity $h$ and a relationship $r$. The box on the left illustrates the calculation of head context $H_c$. $H_c$ is formed as a union of $\mathcal{R}(h)$ and $\mathcal{E}(h)$. Here, $\mathcal{R}(h)$ is the set of all relations ($r1$, $r2$, $r3$, and $r4$) involving the head entity $h$, while $\mathcal{E}(h)$ is the set of all neighboring entities ($e2$, $e3$, $e4$, and $e5$) directly related to $h$. The box on the right shows the calculation of relationship context $R_c$. $R_c$ comprises the set of all entities ($e3$, $e7$, $e2$, and $e6$) associated via relationship $r$. These contextual features, $H_c$ and $R_c$, are then fed alongside $h$ and $r$ as input to the BERT model, as depicted in the middle of the figure. The BERT model, combined with a linear classifier and softmax, generates probabilities for tail entities.
  • Figure 2: A comparison of model parameters.
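The context sets described in the Figure 1 caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation. It assumes that a relation or neighbor "involves" an entity regardless of whether the entity is the head or the tail of the triple. The function names and the `[SEP]`-joined input layout are hypothetical, and the toy triples are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def build_context_index(triples: Iterable[Triple]):
    """Single pass over the triples, so the cost is linear in |T|."""
    relations_of: Dict[str, Set[str]] = defaultdict(set)          # R(h)
    neighbors_of: Dict[str, Set[str]] = defaultdict(set)          # E(h)
    entities_of_relation: Dict[str, Set[str]] = defaultdict(set)  # R_c
    for h, r, t in triples:
        # Assumption: context is collected in both directions.
        relations_of[h].add(r)
        relations_of[t].add(r)
        neighbors_of[h].add(t)
        neighbors_of[t].add(h)
        entities_of_relation[r].update((h, t))
    return relations_of, neighbors_of, entities_of_relation


def head_context(h: str, relations_of, neighbors_of) -> Set[str]:
    """H_c as the union of R(h) and E(h)."""
    return relations_of.get(h, set()) | neighbors_of.get(h, set())


def format_input(h: str, r: str, h_c: Set[str], r_c: Set[str]) -> str:
    """Hypothetical text layout for the encoder input; sorted for determinism."""
    return " [SEP] ".join([h, r, " ".join(sorted(h_c)), " ".join(sorted(r_c))])


# Toy graph (made-up triples).
toy_triples = [
    ("e1", "r1", "e2"),
    ("e1", "r2", "e3"),
    ("e4", "r1", "e1"),
    ("e5", "r1", "e6"),
]
rel_of, nbr_of, ents_of_rel = build_context_index(toy_triples)
h_c = head_context("e1", rel_of, nbr_of)
r_c = ents_of_rel["r1"]
print(format_input("e1", "r1", h_c, r_c))
```

In the described pipeline, a string like this would then be tokenized and passed to BERT, whose output goes through a linear classifier and softmax over candidate tail entities; that step is omitted here because it depends on model details not given in this section.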