Knowledge Graph Enrichment and Reasoning for Nobel Laureates
Thanh-Lam T. Nguyen, Ngoc-Quang Le, Thu-Trang Pham, Mai-Vu Tran
TL;DR
The paper presents an end-to-end pipeline for constructing and analyzing a Nobel Prize knowledge graph (KG), enriching it with entities and relations extracted from Wikipedia biographies by LLM-based Named Entity Recognition (NER) and Relation Extraction (RE). It expands the graph with Notable_Work, Event, and Location entities, and applies social network analysis to reveal small-world characteristics and influential hubs. A GraphRAG-based chatbot with a fine-tuned Text2Cypher component enables natural-language querying and multi-hop reasoning over the KG. The work includes extensive experiments, a large evaluation dataset, and released data and code, highlighting the value of integrating LLM-driven extraction with graph-based reasoning for domain-specific knowledge discovery.
Abstract
This project constructs and analyzes a comprehensive knowledge graph of Nobel Prizes and laureates by enriching existing datasets with biographical information extracted from Wikipedia. Our approach combines automatic data augmentation, in which LLMs perform Named Entity Recognition (NER) and Relation Extraction (RE), with social network analysis to uncover hidden patterns within the scientific community. We also develop a GraphRAG-based chatbot that uses a fine-tuned model for Text2Cypher translation, enabling natural-language querying over the knowledge graph. Experimental results show that the enriched graph exhibits small-world network properties, and the analysis identifies key influential figures and central organizations. The chatbot achieves competitive accuracy on a custom multiple-choice evaluation dataset, supporting the effectiveness of combining LLMs with structured knowledge bases for complex reasoning tasks. Data and source code are available at: https://github.com/tlam25/network-of-awards-and-winners.
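As an illustration of the small-world analysis mentioned above, the following minimal sketch computes the two statistics usually inspected for this property: the average clustering coefficient and the average shortest path length. It is not the authors' implementation. The toy edge list and the function names are hypothetical, and the graph is assumed to be undirected and connected. A network is typically called small-world when its clustering is much higher than that of a random graph of the same size while its average path length stays comparably short; the random-graph baseline is omitted here.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient over all nodes of an undirected graph."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path_length(adj):
    """Mean BFS distance over all ordered pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


# Hypothetical toy graph; the node labels are placeholders, not real KG entities.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

clustering = average_clustering(adj)
path_length = average_shortest_path_length(adj)
print(f"average clustering: {clustering:.3f}")
print(f"average shortest path length: {path_length:.3f}")
```

On a graph the size of the enriched KG, a graph library would normally replace these hand-written loops; the pure-Python version is used here only to keep the sketch self-contained.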
