Learning Rules from KGs Guided by Language Models

Zihang Peng, Daria Stepanova, Vinh Thinh Ho, Heike Adel, Alessandra Russo, Simon Ott

TL;DR

This work tackles the challenge of learning and ranking rules over incomplete knowledge graphs by leveraging language models as an external predictive scorer. It introduces a hybrid rule quality function $\mu(r) = (1-\lambda)\,\mu_1(r) + \lambda\,\mu_2(r)$ that combines traditional descriptive metrics ($\mu_1$) with LM-based guidance ($\mu_2$), where $\mu_2$ relies on the LM’s reciprocal rank for probable completions. A system prototype demonstrates that using LMs to score rules via prompts can improve top-k rule precision on Wiki44K, even without KG-specific fine-tuning of the LM. The findings suggest that LM-guided ranking can enhance KG completion and rule-learning systems, offering a flexible, plug-in improvement for existing rule learners.
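
To make the hybrid score concrete, here is a minimal Python sketch. It assumes that $\mu_1$ is already available as a number in [0, 1] (for example, a rule confidence) and that $\mu_2$ is the mean reciprocal rank the LM assigns to the facts a rule predicts. The averaging step, all function names, and the toy rankings are illustrative assumptions, not details confirmed by the paper.

```python
from typing import Dict, List, Sequence, Tuple

Query = Tuple[str, str]      # (subject, predicate); hypothetical lookup key
Fact = Tuple[str, str, str]  # (subject, predicate, object)


def reciprocal_rank(ranked_candidates: Sequence[str], target: str) -> float:
    """Return 1/rank of `target` (1-based), or 0.0 if it is not ranked at all."""
    for position, candidate in enumerate(ranked_candidates, start=1):
        if candidate == target:
            return 1.0 / position
    return 0.0


def lm_score(predicted_facts: Sequence[Fact],
             lm_rankings: Dict[Query, List[str]]) -> float:
    """Assumed form of mu_2: mean reciprocal rank over a rule's predictions.

    `lm_rankings` stands in for the LM: it maps a (subject, predicate) prompt
    to the LM's candidate objects, best first.
    """
    if not predicted_facts:
        return 0.0
    total = 0.0
    for subject, predicate, obj in predicted_facts:
        ranked = lm_rankings.get((subject, predicate), [])
        total += reciprocal_rank(ranked, obj)
    return total / len(predicted_facts)


def hybrid_quality(mu1: float, mu2: float, lam: float) -> float:
    """mu(r) = (1 - lambda) * mu_1(r) + lambda * mu_2(r)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lambda must lie in [0, 1]")
    return (1.0 - lam) * mu1 + lam * mu2


if __name__ == "__main__":
    # Toy data (made up): the LM ranks the first prediction 1st, the second 2nd.
    rankings = {
        ("alice", "livesIn"): ["paris", "london"],
        ("bob", "livesIn"): ["munich", "berlin"],
    }
    predictions = [("alice", "livesIn", "paris"), ("bob", "livesIn", "berlin")]
    mu2 = lm_score(predictions, rankings)          # (1.0 + 0.5) / 2 = 0.75
    print(hybrid_quality(mu1=0.5, mu2=mu2, lam=0.5))  # 0.625
```

Setting `lam` to 0 recovers the purely descriptive ranking, while `lam` equal to 1 ranks rules by the LM signal alone, so the weight controls how much the external scorer is trusted.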

Abstract

Advances in information extraction have enabled the automatic construction of large knowledge graphs (e.g., Yago, Wikidata or Google KG), which are widely used in many applications like semantic search or data analytics. However, due to their semi-automatic construction, KGs are often incomplete. Rule learning methods, concerned with the extraction of frequent patterns from KGs and casting them into rules, can be applied to predict potentially missing facts. A crucial step in this process is rule ranking. Ranking of rules is especially challenging over highly incomplete or biased KGs (e.g., KGs predominantly storing facts about famous people), as in this case biased rules might fit the data best and be ranked at the top based on standard statistical metrics like rule confidence. To address this issue, prior works proposed to rank rules not only relying on the original KG but also facts predicted by a KG embedding model. At the same time, with the recent rise of Language Models (LMs), several works have claimed that LMs can be used as alternative means for KG completion. In this work, our goal is to verify to which extent the exploitation of LMs is helpful for improving the quality of rule learning systems.
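
As a small illustration of why confidence-based ranking is sensitive to incompleteness, the Python sketch below computes the standard confidence (supported body groundings divided by all body groundings) of one hypothetical rule, livesIn(X, Y) <- marriedTo(X, Z), livesIn(Z, Y), over a made-up toy KG. The rule, the entities, and the helper names are assumptions chosen for illustration and do not come from the paper.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Toy KG (made up). livesIn(cat, rome) is deliberately missing.
KG: Set[Triple] = {
    ("ann", "marriedTo", "bob"), ("bob", "livesIn", "berlin"),
    ("ann", "livesIn", "berlin"),
    ("cat", "marriedTo", "dan"), ("dan", "livesIn", "rome"),
    ("eve", "marriedTo", "fred"), ("fred", "livesIn", "oslo"),
    ("eve", "livesIn", "oslo"),
}


def body_groundings(kg: Set[Triple]) -> Set[Tuple[str, str]]:
    """All (X, Y) such that marriedTo(X, Z) and livesIn(Z, Y) hold for some Z."""
    married = [(s, o) for s, p, o in kg if p == "marriedTo"]
    lives = [(s, o) for s, p, o in kg if p == "livesIn"]
    return {(x, y) for x, z in married for z2, y in lives if z == z2}


def rule_confidence(kg: Set[Triple]) -> float:
    """Fraction of body groundings whose head livesIn(X, Y) is in the KG."""
    pairs = body_groundings(kg)
    if not pairs:
        return 0.0
    support = sum((x, "livesIn", y) in kg for x, y in pairs)
    return support / len(pairs)


def predicted_facts(kg: Set[Triple]) -> Set[Triple]:
    """Head facts the rule derives that are not yet stored in the KG."""
    return {(x, "livesIn", y) for x, y in body_groundings(kg)} - kg


if __name__ == "__main__":
    print(rule_confidence(KG))   # 2 of 3 body groundings are supported
    print(predicted_facts(KG))   # the single missing fact about cat
```

In this toy example the missing fact counts against the rule even though it may simply be absent from the KG, which is the kind of distortion an external scorer such as a KG embedding model or an LM is meant to offset.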

Paper Structure

This paper contains 6 sections, 1 equation, and 2 figures.

Figures (2)

  • Figure 1: System Overview.
  • Figure 2: Average precision of predictions made by rules computed relying on the language model BERT.