
Retrieval-Augmented Question Answering over Scientific Literature for the Electron-Ion Collider

Tina. J. Jat, T. Ghosh, Karthik Suresh

Abstract

To harness the power of language models for answering specialized, domain-specific technical questions, Retrieval-Augmented Generation (RAG) is widely used. In this work, we developed a Q&A application based on RAG, comprising an in-house database indexed on arXiv articles related to the Electron-Ion Collider (EIC), one of the largest international scientific collaborations, with an open-source LLaMA model incorporated for answer generation. This extends its preceding application, which was built on a proprietary model and a cloud-hosted external knowledge base for the EIC. The locally deployed RAG system offers a cost-effective alternative for building a RAG-assisted Q&A application under resource constraints to answer domain-specific queries in experimental nuclear physics. This setup also preserves data privacy, avoiding the transmission of any pre-publication scientific data and information to the public domain. Future improvements will expand the knowledge base to encompass heterogeneous EIC-related publications and reports, and upgrade the application's pipeline orchestration to the LangGraph framework.
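The retrieval flow the abstract describes (ingest and chunk articles, embed them, index, then retrieve top-matching chunks and build a grounded prompt) can be sketched in miniature. This is an illustrative stand-in only: the actual application uses dense sentence embeddings, a ChromaDB index, and a local LLaMA model, whereas here a bag-of-words vector and brute-force cosine search stand in for those components, and all function names are hypothetical.

```python
import math
from collections import Counter

def chunk(text, size=120, overlap=20):
    """Split text into fixed-size character chunks with overlap,
    mirroring the chunk-size/overlap parameters studied in the paper."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    # Toy embedding: word-count vector (the real system uses dense embeddings).
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, chunks, k=2):
    # Brute-force similarity search standing in for a ChromaDB query.
    qv = embed(query)
    scored = [(cosine(qv, embed(c)), c) for c in chunks]
    return [c for _, c in sorted(scored, key=lambda t: -t[0])[:k]]

def build_prompt(query, contexts):
    # Merge retrieved chunks with the query before passing to the LLM.
    ctx = "\n".join(contexts)
    return f"Answer using only the context below.\nContext:\n{ctx}\nQuestion: {query}"
```

In the deployed system, `retrieve` would be replaced by a ChromaDB similarity query and `build_prompt`'s output passed to the LLaMA model for generation.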

Paper Structure

This paper contains 7 sections and 5 figures.

Figures (5)

  • Figure 1: The schematic of the Q&A system design for the EIC. The pipeline consists of article ingestion, document chunking, vector embedding, ChromaDB indexing, retrieval, prompt construction, and response generation. First, the query is encoded into a dense-vector representation; a similarity search is then performed between the embedded query and the locally deployed vectorized database to retrieve the most relevant contexts. These retrieved chunks are subsequently merged with the query through a carefully designed prompt template and passed to a language model, which finally generates a contextually grounded response.
  • Figure 2: Left: Retrieval latency of the RAG system for chunk sizes of 120 and 180 characters with an overlap of 20 characters between consecutive chunks. Right: Retrieval latency measured for two different similarity retrieval mechanisms: cosine similarity and MMR.
  • Figure 3: Latency distribution during answer generation of the RAG application for two different language models, LLaMA 3.2 and LLaMA 3.3, respectively.
  • Figure 4: The performance of the Q&A system as represented by the six RAGAS evaluation metrics for retrieval and answer generation. Upper panel: the evaluation metrics for a chunk size of 120 characters with cosine similarity; lower panel: the same metrics measured for the MMR retrieval technique with a 120-character chunk size.
  • Figure 5: The generation performance as characterized by the six metrics of the RAGAS framework for a chunk size of 180 characters, with cosine similarity (upper panel) and MMR retrieval (lower panel), respectively.
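Figures 2, 4, and 5 compare two retrieval mechanisms: plain cosine similarity and Maximal Marginal Relevance (MMR). MMR greedily selects chunks that balance relevance to the query against redundancy with chunks already chosen. The following is a minimal sketch of the standard MMR selection rule over pre-computed vectors, not the implementation used in the application; the trade-off parameter name `lam` is an assumption.

```python
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr(query_vec, doc_vecs, k=2, lam=0.5):
    """Greedily pick k document indices, trading off relevance to the
    query (first term) against redundancy with documents already
    selected (second term). lam=1 reduces to pure cosine retrieval."""
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a high `lam` the second pick stays close to the query; with a low `lam` it favors diversity, which is the behavior the latency and RAGAS comparisons in the figures probe.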