NeuroLit Navigator: A Neurosymbolic Approach to Scholarly Article Searches for Systematic Reviews

Vedant Khandelwal, Kaushik Roy, Valerie Lookingbill, Ritvik Garimella, Harshul Surana, Heather Heckman, Amit Sheth

TL;DR

NeuroLit Navigator addresses the challenge of constructing precise, reproducible initial literature searches for systematic reviews (SRs) by combining domain-specific LLMs with structured biomedical knowledge sources (MeSH and UMLS) in a neurosymbolic framework. The pipeline has three steps: domain-aware NER; vocabulary extension with knowledge graphs, run alongside query expansion with LLMs; and article retrieval with re-ranking. It yields more context-aware queries and higher-relevance results while reducing librarian workload. Empirical evaluation across deployments shows substantial time savings (up to 90%), improved reproducibility, and competitive relevance, with the system outperforming baselines in interpretability and use of controlled vocabulary. The work demonstrates a practical path to robust, domain-aware SR tooling that is immediately useful to librarians and researchers, and it lays groundwork for multi-iteration refinement and dynamic adaptation to evolving research landscapes.

Abstract

The introduction of Large Language Models (LLMs) has significantly impacted various fields, including education, where they enable the creation of personalized learning materials. However, their use in Systematic Reviews (SRs) reveals limitations such as restricted access to specialized vocabularies, lack of domain-specific reasoning, and a tendency to generate inaccurate information. Existing SR tools often rely on traditional NLP methods and fail to address these issues adequately. To overcome these challenges, we developed the "NeuroLit Navigator," a system that combines domain-specific LLMs with structured knowledge sources like Medical Subject Headings (MeSH) and the Unified Medical Language System (UMLS). This integration enhances query formulation, expands search vocabularies, and deepens search scopes, enabling more precise searches. Deployed in multiple universities and tested by over a dozen librarians, the NeuroLit Navigator has reduced the time required for initial literature searches by 90%. Despite this efficiency, the initial set of articles retrieved can vary in relevance and quality. Nonetheless, the system has greatly improved the reproducibility of search results, demonstrating its potential to support librarians in the SR process.

Paper Structure

This paper contains 43 sections, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: The technical architecture demonstrates the three-step process in detail: (1) User input is processed through domain-specific Named Entity Recognition (NER) to extract key terms and phrases from the query and sentinel articles. (2) Two expansions run in parallel: Vocabulary Extension uses knowledge graphs (KGs) to add related terms within two hops of the extracted concepts, while Query Expansion uses domain-specific LLMs (such as ClinicalBERT) to replace key terms with related substitutes. (3) The final step involves article retrieval and re-ranking based on semantic similarity to the user query, with the top 5 articles provided for librarian feedback.
  • Figure 2: Updated initial input interface where users enter their research queries. This screen serves as the gateway for users to define the scope of their research by entering queries related to their systematic review, enhanced by the option to include a sentinel article for better context alignment.
  • Figure 3: Display of a retrieved article with key sections highlighted, alongside the search query. This feature helps users quickly judge the relevance of the article to their query, making the review process more efficient.
  • Figure 4: Feedback form for each article. Users can indicate which concepts were missed or misrepresented, thus refining the system's future search accuracy and relevance.
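
The two-hop vocabulary extension and top-5 re-ranking described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation. The graph `TOY_KG` is a hypothetical stand-in for MeSH/UMLS relations, the function names `expand_terms` and `rerank` are invented for illustration, and bag-of-words cosine similarity is swapped in for the semantic similarity model the system actually uses.

```python
from collections import Counter, deque
from math import sqrt

# Hypothetical toy graph standing in for MeSH/UMLS relations (not real KG data).
TOY_KG = {
    "hypertension": ["blood pressure", "cardiovascular disease"],
    "blood pressure": ["hypertension", "antihypertensive agents"],
    "cardiovascular disease": ["hypertension", "stroke"],
    "stroke": ["cardiovascular disease", "thrombolysis"],
}


def expand_terms(seeds, graph, max_hops=2):
    """Breadth-first walk; returns each reachable term with its hop distance."""
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        term = queue.popleft()
        if hops[term] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(term, []):
            if neighbor not in hops:
                hops[neighbor] = hops[term] + 1
                queue.append(neighbor)
    return hops


def bag_of_words(text):
    """Lowercased whitespace tokens with counts (assumed stand-in for embeddings)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rerank(query, articles, top_k=5):
    """Sort articles by similarity to the query and keep the top_k."""
    query_vec = bag_of_words(query)
    ranked = sorted(articles, key=lambda text: cosine(query_vec, bag_of_words(text)), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    print(expand_terms(["hypertension"], TOY_KG))
    print(rerank("hypertension blood pressure", [
        "stroke outcomes after thrombolysis",
        "hypertension and blood pressure control",
    ]))
```

In this sketch the hop limit keeps the expansion close to the extracted concepts: "stroke" is reached in two hops from "hypertension", while "thrombolysis" is three hops away and is left out. A real deployment would replace the token-count vectors with embeddings from a domain-specific model.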