Analyzing 16,193 LLM Papers for Fun and Profits

Zhiqiu Xia, Lang Zhu, Bingzhe Li, Feng Chen, Qiannan Li, Chunhua Liao, Feiyi Wang, Hang Liu

TL;DR

The paper analyzes LLM-related publication trends across 77 top CS conferences from 2019 to 2024, compiling 168,331 papers and identifying 16,193 LLM-related works. It combines keyword screening with a prompt-based classifier, topic modeling using embeddings and Ward clustering, and affiliation/nation mapping to derive insights. Results show a dominant AI/NLP focus, a dramatic surge in 2024, and a shift in leadership from industry to academia, alongside stable US/China dominance and rising activity in Hong Kong and other regions. These findings offer a cross-disciplinary view of LLM adoption and can inform strategic research directions and policy considerations.
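The keyword-screening stage and the headline ratio can be illustrated with a minimal sketch. The keyword list, the function names `is_candidate` and `llm_share`, and the matching rule are assumptions made for illustration, not the authors' actual filter; the paper additionally applies a prompt-based classifier, which is not reproduced here.

```python
import re

# Hypothetical keyword list; the paper's actual screening terms are not given here.
LLM_KEYWORDS = ["large language model", "llm", "gpt", "in-context learning"]

# Match any keyword as a whole word, allowing an optional plural "s".
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in LLM_KEYWORDS) + r")s?\b",
    re.IGNORECASE,
)

def is_candidate(title: str, abstract: str = "") -> bool:
    """Return True if the title or abstract mentions any screening keyword."""
    return PATTERN.search(f"{title} {abstract}") is not None

def llm_share(n_llm: int, n_total: int) -> float:
    """Fraction of papers flagged as LLM-related."""
    return n_llm / n_total

if __name__ == "__main__":
    print(is_candidate("A Study of LLMs in Compilers"))
    print(is_candidate("Efficient Graph Partitioning"))
    # Counts taken from the summary above: 16,193 LLM-related out of 168,331 papers.
    print(f"{llm_share(16_193, 168_331):.2%}")
```

With the counts reported above, LLM-related papers make up roughly 9.6% of the collected corpus. A keyword pass like this favors recall over precision, which is presumably why a second classification stage follows it.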

Abstract

Large Language Models (LLMs) are reshaping the landscape of computer science research, driving significant shifts in research priorities across diverse conferences and fields. This study provides a comprehensive analysis of the publication trend of LLM-related papers in 77 top-tier computer science conferences over the past six years (2019-2024). We approach this analysis from four distinct perspectives: (1) We investigate how LLM research is driving topic shifts within major conferences. (2) We adopt a topic modeling approach to identify various areas of LLM-related topic growth and reveal the topics of concern at different conferences. (3) We explore distinct contribution patterns of academic and industrial institutions. (4) We study the influence of national origins on LLM development trajectories. Synthesizing the findings from these diverse analytical angles, we derive ten key insights that illuminate the dynamics and evolution of the LLM research ecosystem.
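The topic-modeling perspective relies on embeddings grouped by Ward clustering. As a rough sketch of that idea only, the code below runs a naive Ward agglomeration over toy vectors: at each step it merges the two clusters whose union increases within-cluster variance the least. The function names, the toy 2-D "embeddings", and the fixed cluster count are assumptions; the paper's embedding model and stopping rule are not specified here, and a real pipeline would likely use a library implementation rather than this quadratic loop.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_cluster(vectors, n_clusters):
    """Merge clusters greedily by Ward's criterion until n_clusters remain.

    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        # Pick the pair (i < j) with the smallest Ward merge cost.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [vectors[k] for k in clusters[ij[0]]],
                [vectors[k] for k in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters

if __name__ == "__main__":
    # Toy stand-ins for paper embeddings (hypothetical values).
    toy = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
    print(ward_cluster(toy, 3))
```

Ward's criterion tends to produce compact, similarly sized groups, which suits topic discovery where each cluster should correspond to one coherent research theme.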

Paper Structure

This paper contains 13 sections, 16 figures, 1 table.

Figures (16)

  • Figure 1: Number of LLM-related papers in each area.
  • Figure 2: LLM-related paper ratio across different conferences.
  • Figure 3: Top 10 topics in ACL.
  • Figure 4: Top 10 topics in EMNLP.
  • Figure 5: Top 10 topics in NAACL.
  • ...and 11 more figures