Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

Runsong Jia, Bowen Zhang, Sergio J. Rodríguez Méndez, Pouya G. Omran

TL;DR

The paper tackles the challenge of fine-grained semantic query processing over scholarly knowledge graphs by focusing on CS research artifacts at ANU. It proposes a framework that integrates the Deep Document Model (DDM) and the Document Object Model Ontology (DOMO) with KG-enhanced Query Processing (KGQP) and LLMs, including automatic LLM-SPARQL fusion for retrieving facts and textual nodes from ASKG. Through extensive experiments, the approach demonstrates improved KG construction accuracy and query efficiency, supported by human evaluation and embedding-distance analyses that show better relevance and diversity in KG-based QA. The work has practical implications for scholarly knowledge management and discovery, enabling precise, reliable interactions with LLMs and laying the groundwork for future multimodal and dynamic knowledge graphs in academic contexts.

Abstract

The proposed research aims to develop an innovative semantic query processing system that enables users to obtain comprehensive information about research works produced by Computer Science (CS) researchers at the Australian National University (ANU). The system integrates Large Language Models (LLMs) with the ANU Scholarly Knowledge Graph (ASKG), a structured repository of all research-related artifacts produced at ANU in the CS field. Each artifact and its parts are represented as textual nodes stored in a Knowledge Graph (KG). To address the limitations of traditional scholarly KG construction and utilization methods, which often fail to capture fine-grained details, we propose a novel framework that integrates the Deep Document Model (DDM) for comprehensive document representation and KG-enhanced Query Processing (KGQP) for optimized complex query handling. DDM enables a fine-grained representation of the hierarchical structure and semantic relationships within academic papers, while KGQP leverages the KG structure to improve query accuracy and efficiency with LLMs. By combining the ASKG with LLMs, our approach enhances knowledge utilization and natural language understanding capabilities. The proposed system employs an automatic LLM-SPARQL fusion to retrieve relevant facts and textual nodes from the ASKG. Initial experiments demonstrate that our framework outperforms baseline methods in terms of retrieval accuracy and query efficiency. We showcase the practical application of our framework in academic research scenarios, highlighting its potential to revolutionize scholarly knowledge management and discovery. This work empowers researchers to acquire and utilize knowledge from documents more effectively and provides a foundation for developing precise and reliable interactions with LLMs.
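
To make the idea of textual nodes more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a paper can be modelled as a tree of textual nodes (paper, section, paragraph) that is flattened into subject-predicate-object triples. The class `TextNode`, the predicates `ex:hasPart` and `ex:hasText`, and the helper functions are hypothetical names chosen for illustration. Plain Python tuples stand in for an RDF store, and a keyword filter stands in for the graph pattern a SPARQL query would express.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class TextNode:
    """Hypothetical textual node: a paper, a section, or a paragraph."""
    node_id: str
    kind: str
    text: str = ""
    children: List["TextNode"] = field(default_factory=list)


def to_triples(node: TextNode) -> Iterator[Triple]:
    """Flatten the document tree into triples, depth first."""
    yield (node.node_id, "rdf:type", f"ex:{node.kind}")
    if node.text:
        yield (node.node_id, "ex:hasText", node.text)
    for child in node.children:
        yield (node.node_id, "ex:hasPart", child.node_id)
        yield from to_triples(child)


def retrieve(triples: List[Triple], keyword: str) -> List[str]:
    """Return ids of nodes whose text mentions the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [s for s, p, o in triples if p == "ex:hasText" and kw in o.lower()]


def ancestors(triples: List[Triple], node_id: str) -> List[str]:
    """Walk ex:hasPart edges upwards to recover the node's structural context."""
    parent = {o: s for s, p, o in triples if p == "ex:hasPart"}
    chain = []
    while node_id in parent:
        node_id = parent[node_id]
        chain.append(node_id)
    return chain


# Made-up example document; none of this text comes from the ASKG.
paper = TextNode("ex:paper1", "Paper", "A hypothetical paper", [
    TextNode("ex:paper1/sec1", "Section", "Introduction", [
        TextNode("ex:paper1/sec1/p1", "Paragraph",
                 "Knowledge graphs store facts as triples."),
    ]),
    TextNode("ex:paper1/sec2", "Section", "Method", [
        TextNode("ex:paper1/sec2/p1", "Paragraph",
                 "We translate questions into SPARQL queries."),
    ]),
])

triples = list(to_triples(paper))
hits = retrieve(triples, "sparql")
context = ancestors(triples, hits[0])
```

In this sketch, `hits` holds the matching paragraph node and `context` holds its enclosing section and paper. A retrieved passage that carries this structural context could then be passed to an LLM together with the user's question.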

Paper Structure (30 sections, 5 equations, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Document Object Model Ontology (DOMO)
  • Figure 2: Deep Document Model (DDM)
  • Figure 3: Deep Document Model Pipeline
  • Figure 4: LLMs Interaction Flow Chart