Divergent LLM Adoption and Heterogeneous Convergence Paths in Research Writing
Cong William Lin, Wu Zhu
TL;DR
The paper examines how Large Language Models (LLMs) influence scientific writing, focusing on uneven adoption and on convergence in writing style. It builds a large-scale arXiv-based dataset (2021–2023) and trains 48 discipline- and prompt-specific classifiers to detect GPT-revised abstracts, which makes it possible to analyze adoption by timing, prompt type, and author demographics. Key findings are substantial cross-disciplinary and demographic disparities in GPT usage, a rapid rise in adoption after ChatGPT's release, and improvements in clarity, conciseness, and formality. LLM use also moves writing toward senior-author norms and drives convergence across groups. The work has practical implications for equity and diversity in scholarly communication, and it highlights the importance of prompt design and a potential trade-off between quality gains and the homogenization of writing styles.
Abstract
Large Language Models (LLMs), such as ChatGPT, are reshaping content creation and academic writing. This study investigates the impact of AI-assisted generative revisions on research manuscripts, focusing on heterogeneous adoption patterns and their influence on writing convergence. Leveraging a dataset of over 627,000 academic papers from arXiv, we develop a novel classification framework by fine-tuning prompt- and discipline-specific large language models to detect the style of ChatGPT-revised texts. Our findings reveal substantial disparities in LLM adoption across academic disciplines, gender, native language status, and career stage, alongside a rapid evolution in scholarly writing styles. Moreover, LLM usage enhances clarity, conciseness, and adherence to formal writing conventions, with improvements varying by revision type. Finally, a difference-in-differences analysis shows that while LLMs drive convergence in academic writing, early adopters, male researchers, non-native speakers, and junior scholars exhibit the most pronounced stylistic shifts, aligning their writing more closely with that of established researchers.
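The abstract does not give the regression specification behind the difference-in-differences analysis, so the following is only a minimal sketch of the textbook two-group, two-period estimator, not the authors' model. The function name `did_estimate`, the outcome (a hypothetical "style distance to established researchers" score), and all numbers in `toy_rows` are assumptions made up for illustration.

```python
from statistics import mean


def did_estimate(rows):
    """Textbook 2x2 difference-in-differences estimate.

    rows: iterable of (treated, post, outcome) tuples, where
      treated = 1 for the adopter group, 0 for the comparison group
      post    = 1 for the period after adoption, 0 for the period before
      outcome = numeric outcome (here: a hypothetical style-distance score)
    """
    # Collect outcomes into the four group-by-period cells.
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)

    # Mean outcome per cell; every cell must be non-empty.
    m = {key: mean(values) for key, values in cells.items()}

    # Change in the treated group minus change in the comparison group.
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Hypothetical toy data, invented for illustration only.
toy_rows = [
    (1, 0, 0.50), (1, 0, 0.54),  # adopters, before
    (1, 1, 0.40), (1, 1, 0.44),  # adopters, after
    (0, 0, 0.48), (0, 0, 0.52),  # non-adopters, before
    (0, 1, 0.47), (0, 1, 0.49),  # non-adopters, after
]

print(round(did_estimate(toy_rows), 3))
```

On this toy data the estimate is about -0.08: the adopters' distance falls by 0.10 while the comparison group's falls by 0.02. A negative value would be read as adopters converging toward the reference style faster than non-adopters, under the usual parallel-trends assumption.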
