ORKG ASK: a Neuro-symbolic Scholarly Search and Exploration System
Allard Oelen, Mohamad Yaser Jaradeh, Sören Auer
TL;DR
The paper addresses the inefficiency of locating and extracting knowledge from the expanding body of scholarly literature. It proposes ORKG ASK, a neuro-symbolic system that combines semantic search, LLM-based question answering over retrieved articles, and knowledge graphs for fine-grained extraction and filtering. The system follows a Retrieval-Augmented Generation (RAG) workflow: it uses a vector store (Nomic embeddings in Qdrant), DBpedia Spotlight for entity linking, and the CORE dataset as its article corpus, and it returns synthesized answers together with structured article metadata. A preliminary usability evaluation suggests the interface is generally easy to use. The authors plan to add provenance information and expand the knowledge graph to improve reproducibility and scalability.
Abstract
Purpose: Finding scholarly articles is a time-consuming and cumbersome activity, yet crucial for conducting science. Due to the growing number of scholarly articles, new scholarly search systems are needed to effectively assist researchers in finding relevant literature. Methodology: We take a neuro-symbolic approach to scholarly search and exploration by leveraging state-of-the-art components, including semantic search, Large Language Models (LLMs), and Knowledge Graphs (KGs). The semantic search component composes a set of relevant articles. From this set of articles, information is extracted and presented to the user. Findings: The presented system, called ORKG ASK (Assistant for Scientific Knowledge), provides a production-ready search and exploration system. Our preliminary evaluation indicates that our proposed approach is indeed suitable for the task of scholarly information retrieval. Value: With ORKG ASK, we present a next-generation scholarly search and exploration system and make it available online. Additionally, the system components are open source with a permissive license.
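To make the retrieve-then-extract workflow described above concrete, here is a minimal sketch in Python. It is not the ORKG ASK implementation. The corpus, the function names (`embed`, `cosine`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical. A bag-of-words vector stands in for a learned embedding model, an in-memory list stands in for a vector store, and the LLM call itself is left out: the sketch stops at the prompt that would be sent to a model.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for an indexed article collection (hypothetical data).
ARTICLES = [
    {"id": "a1", "title": "Knowledge graphs for scholarly communication",
     "abstract": "We describe a knowledge graph that represents research "
                 "contributions in structured form."},
    {"id": "a2", "title": "Retrieval-augmented generation for question answering",
     "abstract": "Retrieved passages are added to the prompt of a language "
                 "model to ground its answers."},
    {"id": "a3", "title": "Soil moisture and crop yield",
     "abstract": "Field experiments relate soil moisture to wheat yield "
                 "across growing seasons."},
]


def embed(text):
    """Return term counts for the text (assumption: a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, articles, k=2):
    """Semantic search step: return up to k articles most similar to the question."""
    query_vector = embed(question)
    scored = [
        (cosine(query_vector, embed(a["title"] + " " + a["abstract"])), a)
        for a in articles
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop articles that share no terms with the question.
    return [article for score, article in scored[:k] if score > 0]


def build_prompt(question, retrieved):
    """Generation step input: place the retrieved articles in the LLM context."""
    context = "\n".join(
        f"[{i}] {a['title']}: {a['abstract']}"
        for i, a in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the numbered articles below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    question = "How are retrieved passages used to ground a language model?"
    print(build_prompt(question, retrieve(question, ARTICLES)))
```

In a real deployment, `embed` would call an embedding model and `retrieve` would query a vector store. The structure stays the same: retrieve a small set of articles first, then hand only that set to the language model as context.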
