OMNIA: Closing the Loop by Leveraging LLMs for Knowledge Graph Completion

Frédéric Ieng, Soror Sahri, Mourad Ouzzani, Massinissa Hammaz, Salima Benbernou, Hanieh Khorashadizadeh, Sven Groppe, Farah Benamara

Abstract

Knowledge Graphs (KGs) are widely used to represent structured knowledge, yet their automatic construction, especially with Large Language Models (LLMs), often results in incomplete or noisy outputs. Knowledge Graph Completion (KGC) aims to infer and add missing triples, but most existing methods rely either on structural embeddings that overlook semantics or on language models that ignore the graph's structure and depend on external sources. In this work, we present OMNIA, a two-stage approach that bridges structural and semantic reasoning for KGC. It first generates candidate triples by clustering semantically related entities and relations within the KG, then validates them through lightweight embedding filtering followed by LLM-based semantic validation. OMNIA operates on the internal KG alone, without external sources, and specifically targets the implicit semantics that are most frequent in LLM-generated graphs. Extensive experiments on multiple datasets demonstrate that OMNIA significantly improves F1-score compared to traditional embedding-based models. These results highlight OMNIA's effectiveness and efficiency: its clustering and filtering stages reduce both the search space and the validation cost while maintaining high-quality completion.
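To make the two-stage flow described in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation. Everything named here is an assumption for illustration: the clusters are supplied by hand rather than computed, `score` stands in for the lightweight embedding filter, `validate` stands in for the LLM call, and the threshold of 0.5 is arbitrary.

```python
from collections import defaultdict
from typing import Callable, Iterable

Triple = tuple[str, str, str]  # (head, relation, tail)


def generate_candidates(triples: Iterable[Triple],
                        clusters: Iterable[set[str]]) -> set[Triple]:
    """Stage 1 (toy version): inside each cluster, propose every
    (relation, tail) pair that one member has and another member lacks."""
    triples = list(triples)
    facts: dict[str, set[tuple[str, str]]] = defaultdict(set)
    for head, relation, tail in triples:
        facts[head].add((relation, tail))

    candidates: set[Triple] = set()
    for members in clusters:
        pooled = set().union(*(facts[m] for m in members))
        for m in members:
            for relation, tail in pooled - facts[m]:
                candidates.add((m, relation, tail))
    return candidates - set(triples)


def complete(triples: Iterable[Triple],
             clusters: Iterable[set[str]],
             score: Callable[[Triple], float],
             validate: Callable[[Triple], bool],
             threshold: float = 0.5) -> list[Triple]:
    """Stage 2 (toy version): a cheap score filter runs first, so the
    expensive validator only sees the shortlisted candidates."""
    candidates = generate_candidates(triples, clusters)
    shortlisted = [c for c in candidates if score(c) >= threshold]
    return sorted(c for c in shortlisted if validate(c))


# Hypothetical toy graph and hand-made cluster (assumptions, not paper data).
kg = [
    ("Paris", "located_in", "Europe"),
    ("Paris", "city_of", "France"),
    ("Paris", "hosts", "Louvre"),
    ("Lyon", "located_in", "Europe"),
    ("Lyon", "crossed_by", "Rhone"),
]
clusters = [{"Paris", "Lyon"}]

# Stand-in for an embedding-based plausibility score.
toy_scores = {
    ("Lyon", "city_of", "France"): 0.9,
    ("Lyon", "hosts", "Louvre"): 0.2,
    ("Paris", "crossed_by", "Rhone"): 0.6,
}

# Stand-in for the LLM validator; records how often it is called.
llm_calls: list[Triple] = []


def fake_llm_validate(triple: Triple) -> bool:
    llm_calls.append(triple)
    return triple == ("Lyon", "city_of", "France")


result = complete(kg, clusters,
                  score=lambda t: toy_scores.get(t, 0.0),
                  validate=fake_llm_validate)
print(result)
print(len(llm_calls), "validator calls for 3 candidates")
```

In this toy run the cluster yields three candidates, the score filter discards one before it reaches the validator, and the validator rejects another. This mirrors, in a very simplified form, the abstract's claim that clustering and filtering shrink both the search space and the validation cost.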

Paper Structure

This paper contains 27 sections, 1 equation, 6 figures, 7 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Example of a missing triple in an LLM-generated knowledge graph.
  • Figure 2: Example of clustering-based triple candidate generation.
  • Figure 3: Overview of OMNIA with its two main steps.
  • Figure 4: Example of entity clustering.
  • Figure 5: Triple-based scenario with its three prompting cases for LLM-based validation.
  • ...and 1 more figure