Automated Description Generation for Software Patches

Thanh Trong Vu, Tuan-Dung Bui, Thanh-Dat Do, Thu-Trang Nguyen, Hieu Dinh Vo, Son Nguyen

TL;DR

This paper proposes PATCHEXPLAINER, an approach that frames patch description generation as a machine translation task and leverages explicit representations of critical elements, historical context, and syntactic conventions.

Abstract

Software patches are pivotal in refining and evolving codebases, addressing bugs, vulnerabilities, and optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, writing descriptions manually is time-consuming, and the results vary in quality and detail. In this paper, we propose PATCHEXPLAINER, an approach that addresses these challenges by framing patch description generation as a machine translation task. In PATCHEXPLAINER, we leverage explicit representations of critical elements, historical context, and syntactic conventions. Moreover, the translation model in PATCHEXPLAINER is designed with an awareness of description similarity. In particular, the model is explicitly trained to recognize and incorporate similarities among patch descriptions that are clustered into groups, improving its ability to generate accurate and consistent descriptions across similar patches. Training pursues two objectives: maximizing similarity and accurately predicting the group to which each description belongs. Our experimental results on a large dataset of real-world software patches show that PATCHEXPLAINER consistently outperforms existing methods, with improvements of up to 189% in BLEU, 5.7X in Exact Match rate, and 154% in Semantic Similarity, affirming its effectiveness in generating software patch descriptions.
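
To make the reported numbers easier to read, the following is a minimal sketch of how an Exact Match rate and a relative improvement could be computed. It is not the authors' evaluation code: the function names `exact_match_rate` and `relative_improvement`, the whitespace normalization, and the sample strings are all assumptions made for illustration.

```python
from typing import Sequence


def exact_match_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of generated descriptions identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0

    # Assumption: runs of whitespace are collapsed before comparing.
    # The paper may normalize differently, or not at all.
    def normalize(text: str) -> str:
        return " ".join(text.split())

    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)


def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Relative gain of new_score over baseline_score, in percent."""
    if baseline_score == 0:
        raise ValueError("baseline_score must be non-zero")
    return (new_score - baseline_score) / baseline_score * 100.0


if __name__ == "__main__":
    # Hypothetical descriptions, not taken from the paper's dataset.
    preds = ["fix memory leak in decoder", "add null check"]
    refs = ["fix memory leak in decoder", "add missing null check"]
    print(exact_match_rate(preds, refs))
    print(relative_improvement(2.0, 1.0))
```

Under this definition, a relative improvement of 189% means the new score is 2.89 times the baseline score.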

Paper Structure (32 sections, 3 equations, 9 figures, 9 tables)

Figures (9)

  • Figure 1: A patch in project FFmpeg and its description
  • Figure 2: PatchExplainer: An Automated Patch Description Generation Approach
  • Figure 3: The descriptions generated by PatchExplainer and the others for patch 8fd7839 in libarchive
  • Figure 4: Performance of PatchExplainer in different description complexity levels (left axis: BLEU & MET.; right axis: SemSim)
  • Figure 5: Patch 8e6b9ef in FFmpeg and the descriptions
  • ...and 4 more figures

Theorems & Definitions (4)

  • Definition 3.1
  • Definition 3.2
  • Definition 3.3
  • Definition 3.4