On the Use of Agentic Coding: An Empirical Study of Pull Requests on GitHub

Miku Watanabe, Hao Li, Yutaro Kashiwa, Brittany Reid, Hajimu Iida, Ahmed E. Hassan

TL;DR

This study empirically analyzes 567 Agentic-PRs generated by Claude Code across 157 open-source projects to understand their practicality and acceptance in real-world software development. It shows that agentic contributions skew toward non-functional improvements (tests, refactoring, documentation), while human PRs more often handle maintenance tasks such as CI. Overall, 83.8% of Agentic-PRs are accepted and merged, and 54.9% of the merged Agentic-PRs are integrated without further modification; the remaining 45.1% require human revisions. When revisions occur, their extent is comparable to that of human-generated PRs, and they most often address bug fixes, documentation, and style adherence. The findings highlight substantial productivity potential for agentic coding, but also emphasize the continued need for human oversight, project-specific guidance, and tooling to manage review, testing, and maintainability at scale.

Abstract

Large language models (LLMs) are increasingly being integrated into software development processes. The ability to generate code and submit pull requests with minimal human intervention, through the use of autonomous AI agents, is poised to become a standard practice. However, little is known about the practical usefulness of these pull requests and the extent to which their contributions are accepted in real-world projects. In this paper, we empirically study 567 GitHub pull requests (PRs) generated using Claude Code, an agentic coding tool, across 157 diverse open-source projects. Our analysis reveals that developers tend to rely on agents for tasks such as refactoring, documentation, and testing. The results indicate that 83.8% of these agent-assisted PRs are eventually accepted and merged by project maintainers, with 54.9% of the merged PRs integrated without further modification. The remaining 45.1% require additional changes and benefit from human revisions, especially for bug fixes, documentation, and adherence to project-specific standards. These findings suggest that while agent-assisted PRs are largely acceptable, they still benefit from human oversight and refinement.
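
The two headline percentages use different denominators: 83.8% is a share of all 567 studied PRs, whereas 54.9% and 45.1% are shares of the merged PRs only. The following minimal sketch makes that relationship explicit. The function name `derive_shares` and the resulting counts are illustrative assumptions: the counts are reconstructed from the rounded percentages above and are approximations, not figures taken from the paper.

```python
# Values as stated in the abstract (percentages are rounded to one decimal).
TOTAL_PRS = 567
MERGED_RATE = 0.838               # share of ALL Agentic-PRs that were merged
UNMODIFIED_GIVEN_MERGED = 0.549   # share of MERGED PRs integrated as-is


def derive_shares(total, merged_rate, unmodified_given_merged):
    """Return approximate counts implied by the rounded rates.

    Hypothetical helper for illustration only; the counts are
    back-calculated estimates, not numbers reported by the authors.
    """
    merged = round(total * merged_rate)
    unmodified = round(merged * unmodified_given_merged)
    return {
        "merged": merged,
        "merged_unmodified": unmodified,
        "merged_revised": merged - unmodified,
        "not_merged": total - merged,
        # Share of ALL studied PRs that were merged without changes.
        "unmodified_share_of_all": unmodified / total,
    }


if __name__ == "__main__":
    shares = derive_shares(TOTAL_PRS, MERGED_RATE, UNMODIFIED_GIVEN_MERGED)
    for name, value in shares.items():
        print(f"{name}: {value}")
```

Under these assumptions, slightly less than half of all studied PRs would have been merged without any modification, which is a noticeably lower share than the 54.9% quoted relative to merged PRs alone.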

Paper Structure

This paper contains 21 sections, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Example of Claude Code refactoring and PR creation.
  • Figure 2: Example GitHub PR created by Claude Code.
  • Figure 3: Overview of the data collection process.
  • Figure 4: Distribution of change metrics including changed files, added lines, and deleted lines in revised commits. Note that these do not include metrics from the first commit.
  • Figure 5: The number of revision commits in Agentic-PRs and Human-PRs. The inner box shows the interquartile range, and the white lines indicate the medians.
  • ...and 1 more figure