Generating Knowledge Graphs from Large Language Models: A Comparative Study of GPT-4, LLaMA 2, and BERT

Ahan Bhatt; Nandan Vaghela; Kush Dudhia

TL;DR

This paper addresses automated knowledge-graph generation for GraphRAG systems by comparing GPT-4, LLaMA 2 (13B), and BERT on a compact unstructured text source. It introduces a seven-step methodology that runs from data selection and preprocessing, through KG generation and ground-truth creation, to multi-metric evaluation with Precision, Recall, F1, Graph Edit Distance ($GED$), and Semantic Similarity. The results show that GPT-4 achieves the highest semantic fidelity and structural alignment (F1 up to 0.82; $GED$ of 6; Overall Similarity 0.87), while LLaMA 2 offers a resource-efficient balance and BERT lags in complex entity–relation modeling. The findings demonstrate the feasibility of LLM-driven KG generation for GraphRAGs and lay the groundwork for scalable, accurate KG construction in real-world applications.

Abstract

Knowledge Graphs (KGs) are essential to GraphRAGs, a class of Retrieval-Augmented Generation (RAG) systems that excel at tasks requiring structured reasoning and semantic understanding. However, creating KGs for GraphRAGs remains a significant challenge because traditional methods are limited in accuracy and scalability. This paper introduces an approach that uses large language models (LLMs) such as GPT-4, LLaMA 2 (13B), and BERT to generate KGs directly from unstructured data, bypassing traditional pipelines. Using Precision, Recall, F1-Score, Graph Edit Distance, and Semantic Similarity, we evaluate the models' ability to generate high-quality KGs. Results show that GPT-4 achieves superior semantic fidelity and structural accuracy, LLaMA 2 excels at lightweight, domain-specific graphs, and BERT provides insight into the challenges of entity-relationship modeling. This study underscores the potential of LLMs to streamline KG creation and make GraphRAGs more accessible for real-world applications, while laying a foundation for future work.
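To make the Precision, Recall, and F1 metrics concrete, here is a minimal sketch of how they could be computed by treating a generated KG and a ground-truth KG as sets of (subject, relation, object) triples. Exact matching after light normalization is an assumption made for illustration; the summary above does not specify the paper's matching rule. The function names and the toy triples are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]


def normalize(triple: Triple) -> Triple:
    """Lowercase each element and collapse whitespace (assumed normalization)."""
    subj, rel, obj = (" ".join(part.lower().split()) for part in triple)
    return (subj, rel, obj)


def triple_prf(predicted: Iterable[Triple], gold: Iterable[Triple]) -> Dict[str, float]:
    """Set-based precision, recall, and F1 over exactly matching triples."""
    pred_set = {normalize(t) for t in predicted}
    gold_set = {normalize(t) for t in gold}
    true_positives = len(pred_set & gold_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data, not taken from the paper.
gold_triples = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize"),
    ("Marie Curie", "field", "Physics"),
    ("Warsaw", "located_in", "Poland"),
]
predicted_triples = [
    ("marie curie", "born_in", "Warsaw"),   # matches after normalization
    ("Marie Curie", "won", "Nobel Prize"),  # matches
    ("Marie Curie", "lived_in", "Paris"),   # not in the ground truth
]

scores = triple_prf(predicted_triples, gold_triples)
```

Exact matching is strict: a triple that paraphrases a relation counts as an error, which is one reason a study like this would pair these scores with a semantic similarity measure.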
