PROMPTHEUS: A Human-Centered Pipeline to Streamline SLRs with LLMs

João Pedro Fernandes Torres, Catherine Mulligan, Joaquim Jorge, Catarina Moreira

TL;DR

Evaluations demonstrate that PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research landscape.

Abstract

The growing volume of academic publications poses significant challenges for researchers conducting timely and accurate Systematic Literature Reviews (SLRs), particularly in fast-evolving fields like artificial intelligence. This growth also makes it increasingly difficult for laypeople to access scientific knowledge effectively, so academic literature is often misrepresented in the popular press and, more broadly, in society. Traditional SLR methods are labor-intensive and error-prone, and they struggle to keep up with the rapid pace of new research. To address these issues, we developed PROMPTHEUS, an AI-driven pipeline that automates the SLR process using Large Language Models (LLMs). We aimed to enhance efficiency by reducing the manual workload while maintaining the precision and coherence required for comprehensive literature synthesis. PROMPTHEUS automates key stages of the SLR process, including systematic search, data extraction, topic modeling using BERTopic, and summarization with transformer models. Evaluations conducted across five research domains demonstrate that PROMPTHEUS reduces review time, achieves high precision, and provides coherent topic organization, offering a scalable and effective solution for conducting literature reviews in an increasingly crowded research landscape. In addition, such tools may reduce the increasing mistrust in science by making summarization more accessible to laypeople. The code for this project is available on GitHub at https://github.com/joaopftorres/PROMPTHEUS.git

Paper Structure

This paper contains 30 sections, 2 figures, 8 tables.

Figures (2)

  • Figure 1: The PROMPTHEUS framework consists of three phases: (1) Systematic Search and Screening using GPT and Sentence-BERT for paper selection, (2) Data Extraction and Topic Modeling with BERTopic and GPT for organizing and generating section titles, and (3) Synthesis and Summarization with T5 and GPT to refine and compile the findings into an SLR LaTeX document. This framework leverages NLP techniques and LLMs for an efficient and scalable SLR process.
  • Figure 2: Performance metrics across different document limits for GPT-3.5 and GPT-4o in the SLR process. (a) CPU Time: GPT-4o consistently requires more time than GPT-3.5 as the number of documents increases, reflecting its computational complexity. (b) Number of Topics: GPT-4o identifies more topics, indicating a finer level of clustering. (c) Topic Coherence: Coherence is stable up to 200 documents for both models, but it declines as more documents are added, suggesting overfitting or noise. (d) ROUGE Scores: Summarization quality improves and plateaus around 200 documents. (e) Cosine Similarity: Both models show stable alignment with input queries, with diminishing returns beyond 200 documents. (f) Readability Scores: Readability peaks around 200 documents before declining, suggesting this as the optimal limit for accessible summaries.
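
The captions above mention similarity-based paper selection (Figure 1, phase 1) and cosine similarity against the input query under a document limit (Figure 2e). The following is a minimal sketch of that idea, not the authors' implementation: it swaps Sentence-BERT embeddings for plain bag-of-words counts so that it runs with the standard library only, and the names `embed`, `cosine_similarity`, and `screen` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase bag-of-words counts.

    Assumption: the real pipeline would use Sentence-BERT vectors here.
    """
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def screen(query: str, abstracts: List[str], limit: int) -> List[Tuple[float, str]]:
    """Rank abstracts by similarity to the query and keep the top `limit`."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(text)), text) for text in abstracts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    papers = [
        "topic modeling for literature reviews",
        "protein folding with graph networks",
        "literature reviews",
    ]
    for score, text in screen("topic modeling literature reviews", papers, limit=2):
        print(f"{score:.3f}  {text}")
```

Here `limit` plays the role of the document limit varied in Figure 2. With dense embeddings the ranking logic stays the same and only `embed` changes.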