Is ChatGPT Transforming Academics' Writing Style?

Mingmeng Geng; Roberto Trotta

TL;DR

Large language models (LLMs), represented by ChatGPT, are having an increasing impact on arXiv abstracts, especially in computer science, where the fraction of LLM-style abstracts is estimated to be approximately 35% when the responses of GPT-3.5 are taken as the baseline.

Abstract

Based on one million arXiv papers submitted from May 2018 to January 2024, we assess the textual density of ChatGPT's writing style in their abstracts through a statistical analysis of word frequency changes. Our model is calibrated and validated on a mixture of real abstracts and ChatGPT-modified abstracts (simulated data) after a careful noise analysis. The words used for estimation are not fixed but adaptive, including those with decreasing frequency. We find that large language models (LLMs), represented by ChatGPT, are having an increasing impact on arXiv abstracts, especially in the field of computer science, where the fraction of LLM-style abstracts is estimated to be approximately 35%, if we take the responses of GPT-3.5 to one simple prompt, "revise the following sentences", as a baseline. We conclude with an analysis of both positive and negative aspects of the penetration of LLMs into academics' writing style.
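
To make the mixture idea in the abstract concrete, the following is a minimal sketch that assumes the observed word frequencies are a linear blend of human-written and LLM-revised frequencies, and recovers the blend fraction by least squares. The function names `word_freq` and `estimate_mix_fraction`, the toy vocabulary, and the numbers are all hypothetical illustrations; this is not the paper's calibrated estimator, which uses adaptive word sets and a noise model.

```python
from collections import Counter


def word_freq(abstracts, vocab):
    """Average number of occurrences per abstract for each word in vocab."""
    counts = Counter()
    for text in abstracts:
        counts.update(w for w in text.lower().split() if w in vocab)
    n = len(abstracts)
    return {w: counts[w] / n for w in vocab}


def estimate_mix_fraction(f_obs, f_human, f_llm):
    """Least-squares eta for f_obs ~ (1 - eta) * f_human + eta * f_llm.

    Assumption: a simple linear mixture over a fixed word set.
    """
    num = 0.0
    den = 0.0
    for w in f_human:
        d = f_llm[w] - f_human[w]          # per-word shift caused by LLM revision
        num += (f_obs[w] - f_human[w]) * d
        den += d * d
    if den == 0.0:
        raise ValueError("human and LLM frequencies are identical")
    return num / den


# Toy frequencies (hypothetical values, not taken from the paper).
f_human = {"significant": 0.10, "crucial": 0.02, "is": 1.50}
f_llm = {"significant": 0.30, "crucial": 0.12, "is": 1.10}

# Simulate a corpus in which 35% of abstracts are LLM-style.
true_eta = 0.35
f_obs = {w: (1 - true_eta) * f_human[w] + true_eta * f_llm[w] for w in f_human}

eta = estimate_mix_fraction(f_obs, f_human, f_llm)
print(round(eta, 3))
```

Note that the toy word set includes a word whose frequency drops after revision ("is"), since the abstract states that words with decreasing frequency are also used for estimation. On real data the estimate would be noisy, which is presumably why calibration on simulated mixtures matters.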

Paper Structure (29 sections, 38 equations, 8 figures, 2 tables)

Introduction
Data
arXiv dataset
English word frequency
Observations and analysis
Changes in word frequency
LLM simulations
LLM impact
Simple model
Noise model
Impact estimation and bias analysis
Calibration and test
Results
Calibration and test results
Estimation from real data
...and 14 more sections

Figures (8)

Figure 1: Word frequency changes in abstracts. The vertical red dashed line demarcates the first time period after ChatGPT's release.
Figure 2: Test results for simulated admixtures of abstracts in period 14. The error bars represent the standard deviation of the estimation results, and the red star is the estimated value of $\eta_n'$ from test data based on optimal $I_j$ with the same mixed ratio $\eta_n$ as in the calibration data. The orange dashed lines correspond to perfect estimation.
Figure 3: Estimates of $\eta_j(t)$ (i.e., ChatGPT impact) from real data. Word frequencies were normalized by the number of abstracts in each period before the estimation was performed. The error bars represent the standard deviation of the estimation results, using 11 different word sets $I_j$ obtained in the calibration procedure with 11 different $\eta_n$. The triangle markers represent the average of 3 estimates, corresponding to the 3 word selection requirements $q$ based on the 3 $\eta_n$ closest to the mean of the previous 11 estimates.
Figure 4: Words with the highest change rate in frequency
Figure 5: Word frequency changes (with different normalization) in abstracts.
...and 3 more figures
