ORKG ASK: a Neuro-symbolic Scholarly Search and Exploration System

Allard Oelen, Mohamad Yaser Jaradeh, Sören Auer

TL;DR

The paper addresses the inefficiency of locating and extracting knowledge from the expanding body of scholarly literature. It proposes ORKG ASK, a neuro-symbolic system that combines semantic search, LLM-based QA with context from retrieved articles, and knowledge graphs for fine-grained extraction and filtering. Using a Retrieval-Augmented Generation workflow with vector stores (Nomic embeddings, Qdrant), DBpedia Spotlight for entity linking, and the CORE dataset, the system provides synthesized answers and structured article metadata. Preliminary usability evaluation suggests the interface is generally easy to use, with plans to add provenance and expand the knowledge graph to improve reproducibility and scalability.

Abstract

Purpose: Finding scholarly articles is a time-consuming and cumbersome activity, yet crucial for conducting science. Due to the growing number of scholarly articles, new scholarly search systems are needed to effectively assist researchers in finding relevant literature.

Methodology: We take a neuro-symbolic approach to scholarly search and exploration by leveraging state-of-the-art components, including semantic search, Large Language Models (LLMs), and Knowledge Graphs (KGs). The semantic search component composes a set of relevant articles. From this set of articles, information is extracted and presented to the user.

Findings: The presented system, called ORKG ASK (Assistant for Scientific Knowledge), provides a production-ready search and exploration system. Our preliminary evaluation indicates that our proposed approach is indeed suitable for the task of scholarly information retrieval.

Value: With ORKG ASK, we present a next-generation scholarly search and exploration system and make it available online. Additionally, the system components are open source with a permissive license.
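
The retrieve-then-answer workflow summarized above can be illustrated with a minimal sketch. It is not the ORKG ASK implementation: a toy bag-of-words vector stands in for a real embedding model, an in-memory list stands in for a vector store, and the three articles are invented for illustration. The function names (`embed`, `retrieve`, `build_prompt`) are hypothetical, and the sketch stops at assembling the prompt that would be sent to an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    words = [w.strip(".,;:?!()").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, articles, k=2):
    """Semantic-search step: rank articles by similarity to the question."""
    q = embed(question)
    ranked = sorted(
        articles,
        key=lambda art: cosine(q, embed(art["abstract"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Generation step (input only): put the retrieved articles in the prompt."""
    context = "\n".join(
        f"[{i}] {art['title']}: {art['abstract']}"
        for i, art in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the numbered articles below.\n"
        f"{context}\n"
        f"Question: {question}"
    )


# Invented example data, not taken from the CORE dataset.
ARTICLES = [
    {"title": "Soil carbon",
     "abstract": "Cover crops increase soil carbon storage in farmland."},
    {"title": "Graph databases",
     "abstract": "Knowledge graphs store entities and relations for structured queries."},
    {"title": "Crop yields",
     "abstract": "Cover crops can affect crop yields depending on soil and climate."},
]

question = "How do cover crops affect soil carbon?"
top = retrieve(question, ARTICLES, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

In a real deployment the ranking would come from dense embeddings queried through a vector store, and the assembled prompt would be passed to an LLM. The sketch only shows how retrieval limits what the model sees.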

Paper Structure

This paper contains 7 sections, 4 figures.

Figures (4)

  • Figure 1: Design of the search result page of the ORKG ASK application.
  • Figure 2: ORKG ASK system workflow integrating neuro-symbolic components.
  • Figure 3: Results for user satisfaction evaluation indicating relatively satisfied users.
  • Figure 4: Results for UMUX-Lite evaluation with a total score of 65.2.
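
For readers unfamiliar with the metric in Figure 4, the sketch below shows how a UMUX-Lite score is commonly computed: two items rated on a 7-point scale, rescaled to 0-100 per respondent, then averaged. Whether the paper used exactly this unadjusted variant is an assumption, and the responses are made up, so the result is not meant to reproduce the reported 65.2.

```python
def umux_lite(item1, item2):
    """Unadjusted UMUX-Lite score (0-100) from two 7-point ratings."""
    for rating in (item1, item2):
        if not 1 <= rating <= 7:
            raise ValueError("ratings must be between 1 and 7")
    # Each item contributes 0-6 points; 12 points maps to 100.
    return (item1 + item2 - 2) * 100 / 12


def mean_umux_lite(responses):
    """Average the per-respondent scores into a single total score."""
    scores = [umux_lite(a, b) for a, b in responses]
    return sum(scores) / len(scores)


# Hypothetical responses: (item 1 rating, item 2 rating) per participant.
responses = [(5, 5), (6, 5), (4, 5)]
total = mean_umux_lite(responses)
print(round(total, 1))
```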