Explaining Explanation: An Empirical Study on Explanation in Code Reviews

Ratnadira Widyasari, Ting Zhang, Abir Bouraffa, Walid Maalej, David Lo

TL;DR

This paper studies the types of explanations used in code reviews and explores the potential of Large Language Models (LLMs), specifically ChatGPT, to generate these types, pointing toward future automation in code reviews.

Abstract

Code reviews are central to software quality assurance. Ideally, reviewers should explain their feedback to enable authors of code changes to understand the feedback and act accordingly. Different developers might need different explanations in different contexts. Therefore, assisting this process first requires understanding the types of explanations reviewers usually provide. The goal of this paper is to study the types of explanations used in code reviews and explore the potential of Large Language Models (LLMs), specifically ChatGPT, in generating these specific types. We extracted 793 code review comments from Gerrit and manually labeled them based on whether they contained a suggestion, an explanation, or both. Our analysis shows that 42% of comments include only suggestions without explanations. We categorized the explanations into seven distinct types, including rule or principle, similar examples, and future implications. When measuring their prevalence, we observed that some explanations are used differently by novice and experienced reviewers. Our manual evaluation shows that, when the explanation type is specified, ChatGPT can correctly generate the explanation in 88 out of 90 cases. This foundational work highlights the potential for future automation in code reviews, which can assist developers in sharing and obtaining different types of explanations as needed, thereby reducing back-and-forth communication.

Paper Structure (35 sections, 12 figures, 3 tables)

Figures (12)

  • Figure 1: Decision tree for useful code review, based on the study by Bosu et al. [bosu2015characteristics].
  • Figure 2: Code review example (Chromium project ID-3935924).
  • Figure 3: Example of a reviewer explanation (Chromium project ID-4585015).
  • Figure 4: Example of different types of explanation by a reviewer in one thread (Chromium project ID-4614863).
  • Figure 5: Overview of our research methodology.
  • ...and 7 more figures