AnalyticsGPT: An LLM Workflow for Scientometric Question Answering
Khang Ly, Georgios Cheirmpos, Adrian Raudaschl, Christopher James, Seyed Amin Tabatabaei
TL;DR
AnalyticsGPT tackles the niche problem of scientometric question answering by decomposing the task into planning, retrieval, and synthesis within a retrieval-augmented generation (RAG) workflow. The authors implement a fixed, modular pipeline (HLPM, DPM, AM, WM, VM) atop a LangChain-based framework and a proprietary analytics platform to enable precise entity resolution and data-grounded responses. Evaluation against a naive RAG baseline, complemented by subject matter expert (SME) and LLM-judge assessments, shows improved coverage and validity and a reduced time-to-insight in pilot studies. The work has practical implications for the science-of-science domain and offers business value for analytics platforms, while acknowledging limitations in evaluation and the potential for hallucinations. Overall, AnalyticsGPT serves as a blueprint for end-to-end, LLM-guided scientometric QA.
Abstract
This paper introduces AnalyticsGPT, an intuitive and efficient large language model (LLM)-powered workflow for scientometric question answering. This underrepresented downstream task addresses the subcategory of meta-scientific questions concerning the "science of science." Compared to traditional scientific question answering based on papers, the task poses unique challenges in the planning phase, namely the need for named-entity recognition of academic entities within questions and multi-faceted data retrieval involving scientometric indices, e.g. impact factors. Beyond their exceptional capacity for traditional natural language processing tasks, LLMs have shown great potential in more complex applications, such as task decomposition, planning, and reasoning. In this paper, we explore the application of LLMs to scientometric question answering and describe an end-to-end system implementing a sequential workflow with retrieval-augmented generation and agentic concepts. We also address the secondary task of effectively synthesizing the data into presentable and well-structured high-level analyses. As a database for retrieval-augmented generation, we leverage a proprietary research performance assessment platform. For evaluation, we consult experienced subject matter experts and leverage LLMs-as-judges. In doing so, we provide valuable insights into the efficacy of LLMs for a niche downstream task. Our (skeleton) code and prompts are available at: https://github.com/lyvykhang/llm-agents-scientometric-qa/tree/acl.
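To make the sequential plan, retrieve, synthesize structure described above concrete, the following is a minimal illustrative sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`plan`, `retrieve`, `synthesize`, `answer`), the `TOY_DB` dictionary standing in for the proprietary platform, the made-up journal names and values, and the use of plain string matching in place of LLM-based entity recognition and planning.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-in for the analytics platform; all values are invented.
TOY_DB = {
    "Journal A": {"impact_factor": 3.2, "documents": 120},
    "Journal B": {"impact_factor": 5.1, "documents": 80},
}
# Hypothetical mapping from phrases in a question to metric keys in TOY_DB.
KNOWN_METRICS = {"impact factor": "impact_factor", "documents": "documents"}


@dataclass
class Plan:
    entities: list  # academic entities recognized in the question
    metrics: list   # scientometric indices to retrieve for each entity


def plan(question: str) -> Plan:
    """Planning step: naive substring matching replaces LLM-based recognition."""
    q = question.lower()
    entities = [name for name in TOY_DB if name.lower() in q]
    metrics = [key for phrase, key in KNOWN_METRICS.items() if phrase in q]
    return Plan(entities, metrics)


def retrieve(p: Plan) -> dict:
    """Retrieval step: look up each planned metric for each planned entity."""
    return {e: {m: TOY_DB[e][m] for m in p.metrics} for e in p.entities}


def synthesize(question: str, data: dict) -> str:
    """Synthesis step: format retrieved data; a real system would prompt an LLM here."""
    lines = [f"Question: {question}"]
    for entity, values in data.items():
        for metric, value in values.items():
            lines.append(f"- {entity}: {metric} = {value}")
    return "\n".join(lines)


def answer(question: str) -> str:
    """Run the three steps in a fixed order."""
    return synthesize(question, retrieve(plan(question)))


if __name__ == "__main__":
    print(answer("Compare the impact factor of Journal A and Journal B"))
```

The point of the sketch is only the fixed ordering: the answer text is built exclusively from what the retrieval step returns, which is the sense in which the response is grounded in data rather than in the model's own recall.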
