Have we reached the beginning of the end for review papers?

Barry Smyth, Padraig Cunningham

TL;DR

The paper analyzes the rise, value, and AI-associated disruption of review papers from 2000 to 2024, using a 17M-article dataset from Semantic Scholar spanning 23 primary fields. It confirms substantial growth in systematic reviews (SRs), meta-analyses (MAs), and narrative reviews (NRs), yet reveals a declining citation dividend, particularly for SRs and MAs, likely due to saturation and to proliferation aided by automation tools. It documents a detectable AI signal in the 2024 literature, with reviews showing a stronger signal than regular papers and narrative reviews showing the strongest signal of all, suggesting that GenAI could further automate and transform review production. The findings imply a looming restructuring of scholarly synthesis, underscoring the need for clearer norms around transparency, authorship, and the meaning of synthesis in the age of AI.

Abstract

Review papers have traditionally enjoyed a high status in academic publishing because of the important role they can play in summarising and synthesising a field of research. They can also attract significantly more citations than primary research papers presenting original research, making them attractive to authors. There has been a dramatic increase in the publication of review papers in recent years, both in raw numbers and as a proportion of overall publication output. In this paper we demonstrate this increase across a wide range of fields of study. We quantify the citation dividend associated with review papers, but also demonstrate that it is declining and discuss the reasons for this decline. We further show that, since the arrival of GenAI tools in 2022, there is evidence of widespread use of GenAI in research paper writing, and we present evidence for a stronger AI signal among review papers compared to primary research papers. We suggest that the potential for GenAI to accelerate and even automate the production of review papers will have a further significant impact on their status.

Paper Structure

This paper contains 40 sections, 1 equation, 5 figures, 22 tables.

Figures (5)

  • Figure 1: Proportion of papers by review type and field of study (FoS). The fields of study are sorted in ascending order of the overall fraction of review papers. The dashed horizontal line indicates the overall fraction of reviews, averaged across all 23 fields of study, and the numbers in brackets in the legend indicate the macro-average of the fraction of different types of reviews for different fields of study.
  • Figure 2: The overall proportion of review papers by FoS over time. The black dashed line indicates the macro-average of the fraction of reviews across all FoS. The numbers in brackets after each highlighted field name correspond to the average year-on-year growth rate for that field. The inset bar-chart shows the total growth in papers between 2000 and 2024.
  • Figure 3: The main line-graph shows the five-year rolling median normalised citation index (NCI) by paper type (2000–2024). The bar-charts compare median NCI (with error-bars representing the 95% confidence intervals) across types for all years (a) and also for two periods (b) before and (c) after the introduction of several major software tools supporting the production of systematic and meta-analytic reviews (annotated along the approximate timeline, e.g., Distiller, Covidence, Rayyan, ASReview).
  • Figure 4: Median excess AI-score (with error-bars representing the 95% confidence intervals) by field of study for papers with likely AI usage in 2024. Results are based on Kruskal–Wallis ($H =33{,}256.2$, $p < 0.001$, $\epsilon^2 = 0.130$) and Bonferroni-corrected Dunn’s post-hoc tests with pairwise effect sizes estimated via Cliff’s delta. Numerical labels above each bar indicate the number of other fields with significantly lower AI-scores and the mean absolute Cliff’s delta for those contrasts. The dashed line separates fields with AI-scores greater than a majority of other disciplines with medium to large average effect sizes (left) from those exceeding only a few or none (right).
  • Figure 5: The main bar-chart shows the median excess AI-scores by FoS (with error-bars representing the 95% confidence intervals) for review and regular papers with likely AI use (>$75^{\text{th}}$ percentile; fields with at least 100 papers in 2024, $n = 250{,}381$). Bars are labeled with significance levels (*, **, ***) denoting Bonferroni-corrected Mann–Whitney tests comparing review vs. regular papers within each field; Cliff's $\delta$ effect sizes are displayed numerically above bars for fields with significant ($p<0.01$) differences. The smaller inset bar-chart shows the median excess AI score (with error-bars representing the 95% confidence intervals) for each type of paper across all 13 FoS.
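
The captions for Figures 4 and 5 report two effect sizes: Cliff's $\delta$ for pairwise contrasts and $\epsilon^2$ for the Kruskal–Wallis test. As a rough illustration of what these quantities measure, the sketch below computes both in plain Python. The function names are hypothetical, the sample values are made up, and the $\epsilon^2 = H / (n - 1)$ form is one common definition that is assumed here; the paper's own implementation is not shown in this summary and may differ.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs.

    Hypothetical helper for illustration; ranges from -1 to 1,
    with 0 meaning neither group tends to score higher.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def epsilon_squared(h_statistic, n_total):
    """Epsilon-squared effect size for a Kruskal-Wallis H statistic.

    Assumes the common form H / (n - 1); other variants exist.
    """
    if n_total < 2:
        raise ValueError("need at least two observations")
    return h_statistic / (n_total - 1)


if __name__ == "__main__":
    # Made-up AI-scores for two paper types, purely for illustration.
    review_scores = [3, 4, 5]
    regular_scores = [1, 2, 3]
    print(cliffs_delta(review_scores, regular_scores))
    print(epsilon_squared(10.0, 101))
```

The pairwise loop is quadratic in sample size, so for samples as large as those in the figures a rank-based computation would be the more practical choice; the brute-force form is used here only because it mirrors the definition directly.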