Wikipedia Contributions in the Wake of ChatGPT
Liang Lyu, James Siderius, Hannah Li, Daron Acemoglu, Daniel Huttenlocher, Asuman Ozdaglar
TL;DR
This paper examines how ChatGPT affects Wikipedia engagement by comparing articles whose content is similar to what ChatGPT generates with articles whose content is dissimilar. It employs a differences-in-differences framework with dissimilar articles as the control group, and it quantifies similarity as the cosine similarity between embeddings of each Wikipedia article and of encyclopedic content generated by GPT-3.5 Turbo. The main findings show a significant post-launch decline in views for similar articles, with weaker and less consistent evidence for edits, indicating that substitution is heterogeneous across content type and article recency. The work highlights potential downstream consequences for future human-driven contributions and for the quality of AI training data, motivating further behavioral studies and improved measures of substitutability as AI models evolve.
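The similarity measure mentioned above can be illustrated with a minimal sketch. It assumes each article and its GPT-generated counterpart have already been embedded as numeric vectors. The function names, the toy vectors, and the threshold-based labeling are hypothetical illustrations, not the authors' pipeline.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def label_articles(similarities, threshold):
    """Label each article 'similar' or 'dissimilar'.

    The cutoff is a hypothetical choice made for this sketch; the source
    does not state how articles are split into the two groups.
    """
    return {
        title: ("similar" if score >= threshold else "dissimilar")
        for title, score in similarities.items()
    }


# Toy 3-dimensional vectors standing in for real embeddings (illustrative only).
wiki_vec = [1.0, 0.0, 1.0]
gpt_vec = [1.0, 0.0, 0.0]
score = cosine_similarity(wiki_vec, gpt_vec)

# Made-up scores for two placeholder articles.
labels = label_articles({"Article A": 0.9, "Article B": 0.2}, threshold=0.5)
```

Cosine similarity depends only on the direction of the vectors, not their length, which makes it a common choice for comparing text embeddings of documents that differ in size.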
Abstract
How has Wikipedia activity changed, following the introduction of ChatGPT, for articles whose content is similar to what ChatGPT can generate? We estimate the impact using differences-in-differences models, with dissimilar Wikipedia articles as a baseline for comparison, to examine how changes in voluntary knowledge contributions and information-seeking behavior differ by article content. Our analysis reveals that newly created, popular articles whose content overlaps with ChatGPT 3.5 saw a greater decline in editing and viewership after the November 2022 launch of ChatGPT than dissimilar articles did. These findings indicate heterogeneous substitution effects, where users selectively engage less with existing platforms when AI provides comparable content. This points to potentially uneven impacts on the future of human-driven online knowledge contributions.
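The logic of a differences-in-differences comparison can be sketched in its simplest two-group, two-period form: subtract the pre-to-post change for dissimilar articles from the pre-to-post change for similar articles. This is an illustration under stated assumptions, not the authors' specification, which may include additional controls or fixed effects. The function name and the outcome values are hypothetical; the numbers are made up for illustration and are not results from the paper.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two differences-in-differences on group means.

    rows: iterable of (group, period, outcome) tuples, where group is
    'similar' or 'dissimilar' and period is 'pre' or 'post'.
    """
    cells = {}
    for group, period, outcome in rows:
        cells.setdefault((group, period), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}

    treated_change = m[("similar", "post")] - m[("similar", "pre")]
    control_change = m[("dissimilar", "post")] - m[("dissimilar", "pre")]
    return treated_change - control_change


# Made-up outcome values (e.g. log monthly views), for illustration only.
toy_rows = [
    ("similar", "pre", 5.0), ("similar", "pre", 5.2),
    ("similar", "post", 4.6), ("similar", "post", 4.8),
    ("dissimilar", "pre", 4.0), ("dissimilar", "pre", 4.2),
    ("dissimilar", "post", 3.9), ("dissimilar", "post", 4.1),
]
estimate = did_estimate(toy_rows)
```

A negative estimate in this setup would mean that similar articles declined more than dissimilar ones over the same period. Reading that as an effect of ChatGPT relies on the usual parallel-trends assumption, namely that the two groups would have moved together absent the launch.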
