Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions
Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto
TL;DR
This study analyzes the early adoption and impact of GitHub's Copilot for PRs, using over 18k PRs whose descriptions were partly AI-generated and 54k PRs without AI-generated descriptions to quantify adoption, review efficiency, and merge likelihood. Through quantitative analyses and causal inference with Entropy Balancing, the authors estimate that Copilot-generated PR descriptions shorten review time by about 19.3 hours and raise the likelihood of merging by a factor of about 1.57, while developers frequently augment or modify the AI-generated content. A qualitative analysis identifies 13 categories of supplementary information and 7 types of editorial actions, with static templates and linked references being the most common augmentations. The work offers practical recommendations for integrating Copilot for PRs with PR templates and highlights the importance of human-in-the-loop refinement in AI-assisted software development, while noting threats to validity related to the early-adopter population and potential bias in the qualitative coding.
Abstract
GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs, such as generating summaries of changes or providing complete walkthroughs with links to the relevant code. As this innovative technology gains traction in the Open Source Software (OSS) community, it is crucial to examine its early adoption and its impact on the development process. Additionally, it offers a unique opportunity to observe how developers respond when they disagree with the generated content. In our study, we employ a mixed-methods approach, blending quantitative analysis with qualitative insights, to examine 18,256 PRs in which parts of the descriptions were crafted by generative AI. Our findings indicate that: (1) Copilot for PRs, though in its infancy, is seeing a marked uptick in adoption. (2) PRs enhanced by Copilot for PRs require less review time and have a higher likelihood of being merged. (3) Developers using Copilot for PRs often complement the automated descriptions with their manual input. These results offer valuable insights into the growing integration of generative AI in software development.
