AI-assisted Programming May Decrease the Productivity of Experienced Developers by Increasing Maintenance Burden

Feiyang Xu, Poonacha K. Medappa, Murat M. Tunc, Martijn Vroegindeweij, Jan C. Fransoo

TL;DR

This study investigates how AI-assisted programming with GitHub Copilot affects OSS development, using a Difference-in-Differences design around Copilot’s technical preview to compare treatment (endorsed languages) and control groups. At the project level, Copilot is associated with measurable productivity gains (lines added, commits, PRs), yet PR rework increases, signaling lower initial code quality and higher maintenance burden. At the individual level, productivity gains are driven by peripheral contributors, while core contributors experience reduced direct coding activity and a rising maintenance load (more PR reviews and broader repository coverage), culminating in a 19% drop in core commits alongside a 6.5% rise in reviews. Overall, the results suggest AI-era productivity gains may mask growing maintenance burdens on a shrinking pool of experienced maintainers, with implications for OSS sustainability and software governance in broader ecosystems.
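
The Difference-in-Differences comparison described above can be illustrated with a minimal 2x2 sketch. This is not the paper's estimation code: the function name `did_estimate`, the row layout, and the toy numbers are all hypothetical, and a real analysis would typically use a regression with fixed effects and clustered standard errors rather than raw cell means.

```python
from statistics import mean


def did_estimate(rows):
    """Return the 2x2 difference-in-differences estimate.

    rows: iterable of (treated, post, outcome) tuples, where `treated`
    marks the treatment group and `post` marks the period after the
    intervention. Both names are illustrative assumptions.
    """
    # Collect outcomes into the four (group, period) cells.
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(bool(treated), bool(post))].append(outcome)

    m = {key: mean(values) for key, values in cells.items()}

    # Change in the treated group minus change in the control group.
    treated_change = m[(True, True)] - m[(True, False)]
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change


# Hypothetical toy data, e.g. monthly commits per repository.
rows = [
    (True, False, 10.0), (True, False, 12.0),   # treated, before
    (True, True, 15.0), (True, True, 17.0),     # treated, after
    (False, False, 8.0), (False, False, 10.0),  # control, before
    (False, True, 9.0), (False, True, 11.0),    # control, after
]

effect = did_estimate(rows)
print(effect)
```

The control group's change stands in for what would have happened to the treated group without the intervention, which is why the estimate is only credible when both groups followed parallel trends beforehand.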

Abstract

Generative AI solutions like GitHub Copilot have been shown to increase the productivity of software developers. Yet prior work remains unclear on the quality of code produced and the challenges of maintaining it in software projects. If quality declines as volume grows, experienced developers face increased workloads reviewing and reworking code from less-experienced contributors. We analyze developer activity in Open Source Software (OSS) projects following the introduction of GitHub Copilot. We find that productivity indeed increases. However, the increase in productivity is primarily driven by less-experienced (peripheral) developers. We also find that code written after the adoption of AI requires more rework. Importantly, the added rework burden falls on the more experienced (core) developers, who review 6.5% more code after Copilot's introduction, but show a 19% drop in their original code productivity. More broadly, this finding raises caution that productivity gains of AI may mask the growing burden of maintenance on a shrinking pool of experts.

Paper Structure

This paper contains 12 sections, 3 equations, 5 figures, 11 tables.

Figures (5)

  • Figure 1: The workflow of OSS projects. It centers on the primary branch of a GitHub repository (project), which typically contains the main source code that serves as the foundation for new feature development, bug fixes, and updates. Changes to this branch are usually controlled through a structured review process conducted by maintainers to ensure code quality and prevent issues. To contribute, a developer submits a pull request (PR) proposing changes to the project. A PR lets developers submit modifications, request feedback, and merge updates into the main branch. The review and refinement process comprises two activities that we measure in our study: pull request review (PR Review) and pull request rework (PR Rework). Eventually, the reviewed and reworked changes are either merged into the main branch or undergo additional rounds of review and rework.
  • Figure 2: Parallel trends and dynamic effects of the Copilot treatment for pull request rework. The horizontal axis represents the months relative to the introduction of Copilot, while the vertical axis shows the estimated coefficients with 95% confidence intervals. The coefficients for the pre-treatment periods (leads) are statistically insignificant, indicating that there are no systematic differences in trends between the treated and control groups before the introduction of Copilot. This suggests that the parallel trends assumption holds, supporting the validity of our DiD estimation.
  • Figure 3: Histogram of contributions for core and peripheral contributors. Core contributors decreased their development activity after the deployment of Copilot, while peripheral contributors displayed the opposite behaviour.
  • Figure 4: Contribution activity analysis by contributor subgroup. Panels show estimated coefficients (converted to %) from DiD regressions with 95% confidence intervals, capturing the relative change in activity post-Copilot exposure compared to control repositories. (a) Commits: Commit activity declines progressively with contributor experience, with the top 25% experiencing a 19% reduction, suggesting reduced hands-on coding engagement. (b) Pull Requests: The most peripheral contributors (0–25%) significantly increase their PR submissions (17.7%), highlighting increased participation from less experienced developers. (c) PR Reviews: The top 25% of contributors (core) exhibit a significant increase in review activity, suggesting a shift of responsibility towards the core contributors. (d) PR Reviewed Repositories: Similarly, only the core contributor group shows a meaningful rise in the number of distinct repositories reviewed, indicating a broader oversight role.
  • Figure 5: The timeline of our study period: the technical preview of GitHub Copilot was launched on June 29, 2021, initially endorsing five programming languages -- Python, JavaScript, Ruby, TypeScript, and Go. We designed our study to include the 12 months before and the 12 months after Copilot's introduction.
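
The Figure 4 caption mentions coefficients "converted to %". The captions do not say how the conversion is done. One common convention, assumed here, applies when the outcome is log-transformed: a coefficient β corresponds to a relative change of (exp(β) − 1) × 100%. The sketch below illustrates that convention only; the function names and the input values are hypothetical and are not taken from the paper.

```python
import math


def log_coef_to_percent(beta):
    """Convert a coefficient on a log-transformed outcome to a % change.

    Assumes a log-outcome specification; this is an illustrative
    convention, not a formula confirmed by the source.
    """
    return (math.exp(beta) - 1.0) * 100.0


def ci_to_percent(beta, se, z=1.96):
    """Convert a normal-approximation 95% CI on beta to the % scale."""
    lower = log_coef_to_percent(beta - z * se)
    upper = log_coef_to_percent(beta + z * se)
    return lower, upper


# Hypothetical coefficient and standard error, for illustration only.
beta, se = 0.10, 0.02
point = log_coef_to_percent(beta)
low, high = ci_to_percent(beta, se)
print(f"{point:.2f}% [{low:.2f}%, {high:.2f}%]")
```

Because the exponential is monotonic, transforming the interval endpoints preserves their order, although the resulting interval is no longer symmetric around the point estimate.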