Who is using AI to code? Global diffusion and impact of generative AI
Simone Daniotti, Johannes Wachs, Xiangnan Feng, Frank Neffke
TL;DR
The study builds a large-scale detector for AI-generated Python code, anchored in ground-truth data, and applies it to millions of GitHub commits across six countries to map the diffusion and productivity impacts of genAI coding tools. Using GraphCodeBERT-based classification, country-specific corrections, and fixed-effects regressions, it finds that AI-generated code reached about 29% of US Python functions by 2024 and raised quarterly commit activity by roughly 3.6%. These gains are driven mainly by experienced programmers, and AI use also encourages exploration of new library domains. The paper further validates the detector on real-world code and on newer models, analyzes cross-country patterns, and estimates the broader economic value and potential welfare gains under different general-equilibrium scenarios. It concludes that genAI's impact is substantial but highly heterogeneous. These findings inform policymakers and researchers about diffusion barriers, the distributional consequences for skills, and the scale of productivity and innovation effects in software development.
Abstract
Generative coding tools promise big productivity gains, but uneven uptake could widen skill and income gaps. We train a neural classifier to spot AI-generated Python functions in over 30 million GitHub commits by 170,000 developers, tracking how fast, and where, these tools take hold. Today, AI writes an estimated 29% of Python functions in the US, a modest and shrinking lead over other countries. We estimate that this adoption has increased quarterly output, measured in online code contributions, by 3.6%. Our evidence suggests that programmers using AI may also more readily expand into new domains of software development. However, experienced programmers capture nearly all of these productivity and exploration gains, widening rather than closing the skill gap.
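
A share such as the 29% figure comes from a classifier that makes errors in both directions, so the raw fraction of functions flagged as AI-generated is not itself an adoption rate. As a rough illustration only, the sketch below applies the standard Rogan-Gladen adjustment for a binary classifier with known error rates. It is not the paper's correction procedure: the function name `correct_share` and the error rates used in the example are hypothetical.

```python
def correct_share(observed_share: float, false_pos_rate: float, false_neg_rate: float) -> float:
    """Adjust a raw flagged share for classifier error (Rogan-Gladen estimator).

    Assumption: the error rates are known, e.g. measured on a labeled
    validation set. All names and numbers here are illustrative.
    """
    # Denominator equals sensitivity + specificity - 1; it must be positive
    # for the classifier to be informative.
    denom = 1.0 - false_pos_rate - false_neg_rate
    if denom <= 0:
        raise ValueError("classifier is no better than chance; correction undefined")
    corrected = (observed_share - false_pos_rate) / denom
    # Sampling noise can push the estimate slightly outside [0, 1]; clip it.
    return min(1.0, max(0.0, corrected))


# Hypothetical example: 30% of functions flagged, 5% false positives, 10% false negatives.
estimate = correct_share(0.30, 0.05, 0.10)
print(f"corrected share: {estimate:.3f}")
```

The point of the sketch is that the size of the adjustment depends on the error rates, which is one reason a detector's raw output may need separate corrections for different populations of code.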
