Automated Code Review Assignments: An Alternative Perspective of Code Ownership on GitHub

Jai Lal Lulla, Raula Gaikovina Kula, Christoph Treude

TL;DR

The paper investigates GitHub CODEOWNERS as an explicit, policy-driven code ownership mechanism, assessing its adoption, adherence, and impact on pull request (PR) dynamics and reviewer workloads at scale. Using a curated dataset of 645 repositories with CODEOWNERS files plus a baseline group, it applies regression discontinuity design (RDD) to quantify causal effects around CODEOWNERS adoption. Findings show modest adherence that strengthens with higher coverage, reveal that owners listed in CODEOWNERS are often not core committers yet behave similarly in PRs to contribution-based owners, and demonstrate a shift toward more streamlined PR processes and redistributed workload after adoption. The work highlights CODEOWNERS as a governance and security instrument with practical implications for maintainers, contributors, and researchers, and suggests avenues for further study of explicit ownership in open-source ecosystems.
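To make the RDD idea concrete, here is a minimal sketch of a sharp regression discontinuity estimate. It is not the paper's model, whose exact specification is not given here. The sketch assumes a single outcome (a hypothetical median PR merge time in hours), a running variable measured in months relative to CODEOWNERS adoption, and separate straight-line fits on each side of the cutoff. The function names, the bandwidth, and the synthetic data are all illustrative assumptions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rdd_jump(points, cutoff=0.0, bandwidth=12.0):
    """Estimate the discontinuity in the outcome at the cutoff.

    A line is fitted separately to the observations just before and
    just after the cutoff; the estimate is the gap between the two
    fitted lines evaluated at the cutoff itself.
    """
    left = [(t, y) for t, y in points if cutoff - bandwidth <= t < cutoff]
    right = [(t, y) for t, y in points if cutoff <= t <= cutoff + bandwidth]
    a_left, b_left = fit_line(*zip(*left))
    a_right, b_right = fit_line(*zip(*right))
    return (a_right + b_right * cutoff) - (a_left + b_left * cutoff)


# Synthetic, noise-free data (hypothetical): merge time follows one trend
# before adoption (month < 0) and drops by 6 hours at adoption (month 0).
points = [(t, 40 + 0.5 * t) for t in range(-12, 0)]
points += [(t, 34 + 0.7 * t) for t in range(0, 12)]

jump = rdd_jump(points)
print(f"estimated jump at adoption: {jump:.2f} hours")
```

On real repository data the observations are noisy, so the bandwidth choice and any control variables matter; the sketch only shows where the estimate comes from.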

Abstract

Code ownership is central to ensuring accountability and maintaining quality in large-scale software development. Yet, as external threats to project health and quality assurance, such as software supply chain attacks, increase, mechanisms for assigning and enforcing responsibility have become increasingly critical. In 2017, GitHub introduced the CODEOWNERS feature, which automatically designates reviewers for specific files to strengthen accountability and protect critical parts of the codebase. Despite its potential, little is known about how CODEOWNERS is actually adopted and practiced. We present the first large-scale empirical study of CODEOWNERS usage, covering more than 844,000 pull requests with 1.9 million comments and over 2 million reviews. We identify 10,287 code owners and track their review activities. Results indicate that code owners tend to adhere to the rules specified in the CODEOWNERS file and exhibit collaborative behaviours similar to those of owners identified by traditional ownership metrics, but tend to contribute to a smoother and faster PR workflow over time. Finally, using regression discontinuity design (RDD) analysis, we find that repositories adopting CODEOWNERS experience shifts in review dynamics, as ownership redistributes review responsibilities away from core developers. Our results position CODEOWNERS as a promising yet underutilized mechanism for improving software governance and resilience. We discuss how projects can leverage this alternative perspective on ownership to enhance security, accountability, and workflow efficiency in open-source development.
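As an illustration of how a CODEOWNERS file designates reviewers for specific files, the sketch below resolves the owners of a changed path. It relies on GitHub's documented rule that the last matching pattern in the file takes precedence. The matcher is deliberately simplified and is not GitHub's implementation: it only handles basename globs such as `*` and `*.py` and root-anchored directories such as `/docs/`, not the full gitignore-style syntax. The file contents, owner handles, and function names are hypothetical.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Hypothetical CODEOWNERS content used only for this example.
CODEOWNERS_TEXT = """\
# Default owners for everything in the repository
*            @org/maintainers
*.py         @alice
/docs/       @bob @carol
"""


def parse_codeowners(text):
    """Return a list of (pattern, owners) rules in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def matches(pattern, path):
    """Simplified matcher (assumption: only two pattern shapes)."""
    if pattern.startswith("/") and pattern.endswith("/"):
        # Root-anchored directory: owns everything beneath it.
        return path.startswith(pattern[1:])
    # Otherwise treat the pattern as a glob on the file name.
    return fnmatchcase(basename(path), pattern)


def owners_for(path, rules):
    """Owners of `path`: the last matching rule wins."""
    owners = []
    for pattern, rule_owners in rules:
        if matches(pattern, path):
            owners = rule_owners
    return owners


rules = parse_codeowners(CODEOWNERS_TEXT)
for changed in ["src/app.py", "docs/guide.py", "README.md"]:
    print(changed, "->", owners_for(changed, rules))
```

Because later rules override earlier ones, `docs/guide.py` resolves to the `/docs/` owners even though it also matches `*.py`. That is why broad defaults usually appear at the top of the file and specific rules below them.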

Paper Structure

This paper contains 25 sections, 2 equations, 9 figures, 8 tables.

Figures (9)

  • Figure 1: Overview of our study showing data collection, preparation, and designation to each RQ.
  • Figure 2: Distribution of Commits, Pull Requests, and Issues between repositories with and without CODEOWNERS files.
  • Figure 3: Distribution of Contributors and Age of Repositories with and without CODEOWNERS files.
  • Figure 4: Characteristics of CODEOWNERS files across repositories, showing distributions of rule line counts and file coverage.
  • Figure 5: Repository pull request merge time trend before and after CODEOWNERS adoption.
  • ...and 4 more figures