AnalyticsGPT: An LLM Workflow for Scientometric Question Answering

Khang Ly, Georgios Cheirmpos, Adrian Raudaschl, Christopher James, Seyed Amin Tabatabaei

TL;DR

AnalyticsGPT tackles the niche problem of scientometric question answering by decomposing the task into planning, retrieval, and synthesis within a Retrieval-Augmented Generation workflow. The authors implement a fixed, modular pipeline (HLPM, DPM, AM, WM, VM) atop a LangChain-based framework and a proprietary analytics platform to enable precise entity resolution and data-grounded responses. Evaluation against a naive RAG baseline, complemented by SME and LLM-judge assessments, shows improved coverage and validity and a reduced time-to-insight in pilot studies. The work demonstrates practical implications for the science-of-science domain and offers business value for analytics platforms, while acknowledging limitations in evaluation and potential hallucinations. Overall, AnalyticsGPT provides a robust blueprint for end-to-end LLM-guided scientometric QA with tangible performance gains over simpler baselines.

Abstract

This paper introduces AnalyticsGPT, an intuitive and efficient large language model (LLM)-powered workflow for scientometric question answering. This underrepresented downstream task addresses the subcategory of meta-scientific questions concerning the "science of science." Compared with traditional scientific question answering based on papers, the task poses unique challenges in the planning phase, namely the need for named-entity recognition of academic entities within questions and multi-faceted data retrieval involving scientometric indices, e.g., impact factors. Beyond their exceptional capacity for traditional natural language processing tasks, LLMs have shown great potential in more complex applications, such as task decomposition, planning, and reasoning. In this paper, we explore the application of LLMs to scientometric question answering and describe an end-to-end system implementing a sequential workflow with retrieval-augmented generation and agentic concepts. We also address the secondary task of effectively synthesizing the data into presentable and well-structured high-level analyses. As a database for retrieval-augmented generation, we leverage a proprietary research performance assessment platform. For evaluation, we consult experienced subject matter experts and leverage LLMs-as-judges. In doing so, we provide valuable insights on the efficacy of LLMs for a niche downstream task. Our (skeleton) code and prompts are available at: https://github.com/lyvykhang/llm-agents-scientometric-qa/tree/acl.
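
To make the idea of a fixed, sequential workflow more concrete, here is a minimal sketch in plain Python. It is not the authors' code and does not use LangChain or the proprietary platform. The function names only echo the module abbreviations (HLPM, DPM, AM, WM, VM), and their order follows the order in which the modules are listed, which is an assumption. The `State` fields, the `TOY_PLATFORM` dictionary and its values, and the substring-based entity matching are all hypothetical placeholders for what would be LLM calls and platform queries in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-in for the analytics platform; names and values are invented.
TOY_PLATFORM: Dict[str, Dict[str, float]] = {
    "Example University": {"publications": 120.0, "citations": 950.0},
}


@dataclass
class State:
    """Data passed from one module to the next (field names are assumptions)."""
    question: str
    plan: List[str] = field(default_factory=list)
    queries: List[Dict[str, str]] = field(default_factory=list)
    records: List[Dict[str, object]] = field(default_factory=list)
    text: str = ""
    charts: List[Dict[str, object]] = field(default_factory=list)


def hlpm(state: State) -> State:
    # Placeholder for an LLM call that drafts a coarse plan.
    state.plan = ["resolve entities", "retrieve metrics", "summarize"]
    return state


def dpm(state: State) -> State:
    # Placeholder for entity recognition: substring match against known names.
    for entity, metrics in TOY_PLATFORM.items():
        if entity.lower() in state.question.lower():
            for metric in metrics:
                state.queries.append({"entity": entity, "metric": metric})
    return state


def am(state: State) -> State:
    # Execute each planned query against the toy data source.
    for query in state.queries:
        value = TOY_PLATFORM[query["entity"]][query["metric"]]
        state.records.append({**query, "value": value})
    return state


def wm(state: State) -> State:
    # Placeholder for LLM synthesis: format the retrieved records as text.
    if not state.records:
        state.text = "No matching data found."
    else:
        state.text = "\n".join(
            f'{r["entity"]}: {r["metric"]} = {r["value"]:g}' for r in state.records
        )
    return state


def vm(state: State) -> State:
    # Emit a chart specification instead of rendering an actual figure.
    if state.records:
        state.charts.append({
            "type": "bar",
            "x": [r["metric"] for r in state.records],
            "y": [r["value"] for r in state.records],
        })
    return state


# Fixed module order; each stage only reads what earlier stages wrote.
PIPELINE: List[Callable[[State], State]] = [hlpm, dpm, am, wm, vm]


def run(question: str) -> State:
    state = State(question=question)
    for module in PIPELINE:
        state = module(state)
    return state
```

Calling `run("How many publications does Example University have?")` fills `state.text` and `state.charts` from the toy data, while a question that names no known entity yields the fallback message. Keeping the stages as plain functions over a shared state object is one simple way to express a fixed pipeline; an agentic design would instead let the model decide which step to run next.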

Paper Structure (24 sections, 4 figures, 4 tables)

Figures (4)

  • Figure 1: Overview of AnalyticsGPT, showing the main modules: High-Level Planning Module (HLPM), Detailed Planning Module (DPM), Action Module (AM), Writing Module (WM), and Visualization Module (VM). Each module, including user input semantics and the RAG interface, is further discussed separately in Section \ref{sec:methodology}.
  • Figure 2: Distribution of question forms by count in the evaluation set. Note that single-intent (SING_INT) is a custom definition and not part of DBLP-QuAD. We overrepresent the fact-based category to pad the dataset with ample base cases, as users often tried to ask more complex questions.
  • Figure 3: Criteria scores separated by question form (as described in Section \ref{sec:question_forms}), per method. Each graph also indicates the trace of the other, for ease of comparison.
  • Figure 4: Example textual response for "What are the most relevant scholar contributions to SDGs which I should present in my 'Sustainable Energy' day at school?" (Note that we have manually censored the IDs within in-text citations to papers; these were correctly cited by the model.)