Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals

Tobias A. Opsahl

TL;DR

This paper tackles automatic fact verification over structured knowledge graphs, evaluating three modeling paradigms on FactKG, a dataset built from DBpedia: textual fine-tuning, a hybrid QA-GNN, and ChatGPT prompting. It shows that simple, non-trainable subgraph retrieval strategies, especially single-step retrieval, can reach high accuracy (up to 93.49% on the test set) while substantially reducing training time compared to prior work. The results indicate that complex subgraph retrieval may be unnecessary for strong performance and highlight the potential of KG-based evidence combined with efficient retrieval. They also reveal challenges in multi-hop reasoning and in the reproducibility of LLM-based approaches. The findings have practical implications for scalable, evidence-grounded fact verification and point to future work on deeper subgraphs, other datasets, and hybrid LLM-KG systems.

Abstract

Despite recent success in natural language processing (NLP), fact verification remains a difficult task. Because misinformation is spreading increasingly fast, attention has been directed towards automatically verifying the correctness of claims. In NLP, this is usually done by training supervised machine learning models to verify claims using evidence from trustworthy corpora. We present efficient methods for verifying claims on a dataset where the evidence takes the form of structured knowledge graphs. We use the FactKG dataset, which is constructed from the DBpedia knowledge graph extracted from Wikipedia. By simplifying the evidence retrieval process, from fine-tuned language models to simple logical retrievals, we are able to construct models that both require fewer computational resources and achieve better test-set accuracy.

Paper Structure (21 sections, 3 figures, 5 tables)

Figures (3)

  • Figure 1: An example claim from FactKG (Kim et al., 2023). The claim can be verified or refuted based on the DBpedia KG (Lehmann et al., 2015). This is Figure 1 from Kim et al. (2023).
  • Figure 2: Examples of the different subgraphs explored in this article. Boxes and bold letters represent entities, while arrows and italic letters represent relations. This claim is meant for illustrative purposes and is not present in the FactKG dataset.
  • Figure 3: Final prompt used to get truth values from ChatGPT 4o. The actual questions are omitted, but they followed the format of the Example Claims. The Example Claims are from the training set, and the Example Output is copied verbatim from an actual ChatGPT answer.