
GIRT-Model: Automated Generation of Issue Report Templates

Nafiseh Nikeghbal, Amir Hossein Kargaran, Abbas Heydarnoori

TL;DR

The paper tackles the limited adoption of issue report templates (IRTs) by introducing GIRT-Model, an open-source system that automatically generates customized IRTs from developer instructions. It builds GIRT-Instruct by combining GIRT-Data metadata with Zephyr-7B-Beta-generated summaries to create instruction-output pairs for fine-tuning a T5-base model. Across extensive automated and human evaluations, GIRT-Model consistently outperforms baselines (T5 and Flan-T5 variants) in generation quality and usefulness, and a user study with engineers indicates the approach is time-saving and helpful for template design. The authors release their code, dataset, and UI publicly and discuss limitations and future work, including YAML support and richer metadata integration to broaden applicability.

Abstract

Platforms such as GitHub and GitLab introduce Issue Report Templates (IRTs) to enable more effective issue management and better alignment with developer expectations. However, these templates are not widely adopted in most repositories, and there is currently no tool available to aid developers in generating them. In this work, we introduce GIRT-Model, an assistant language model that automatically generates IRTs based on the developer's instructions regarding the structure and necessary fields. We create GIRT-Instruct, a dataset comprising pairs of instructions and IRTs, with the IRTs sourced from GitHub repositories. We use GIRT-Instruct to instruction-tune a T5-base model to create the GIRT-Model. In our experiments, GIRT-Model outperforms general language models (T5 and Flan-T5 with different parameter sizes) in IRT generation by achieving significantly higher scores in ROUGE, BLEU, METEOR, and human evaluation. Additionally, we analyze the effectiveness of GIRT-Model in a user study in which participants wrote short IRTs with GIRT-Model. Our results show that the participants find GIRT-Model useful in the automated generation of templates. We hope that through the use of GIRT-Model, we can encourage more developers to adopt IRTs in their repositories. We publicly release our code, dataset, and model at https://github.com/ISE-Research/girt-model.

Paper Structure (57 sections, 2 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: Example of IRT generation for a bug report. The upper part is the instruction section. If the user explicitly wants a field to be empty, they can add the <|EMPTY|> token. In this example, it is used for assignees. Otherwise, they can add <|MASK|> to let the model decide what to fill in, for example, in the case of about, title, and labels. It's not necessary to fill in any fields, including the summary, but doing so provides more information about what the expected IRT would be.
  • Figure 2: Top: Progression of training for loss on the validation and training sets. X-axis: Epoch number, Y-axis: Loss value. Bottom: Training progression for metric values on the validation set. X-axis: Epoch number, Y-axis: Metric value.
  • Figure 3: UI designed to interact with GIRT-Model. ①: IRT input examples; ②: metadata fields of IRT inputs; ③: summary field of IRT inputs; ④: model config; ⑤: generated instruction based on the IRT inputs; ⑥: generated IRT.
  • Figure 4: Exit interview results. Participants find GIRT-Model beneficial, reporting high scores for "usefulness" and "goal achievement".
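
The Figure 1 caption describes how an instruction is assembled: a field can carry a user-supplied value, the <|EMPTY|> token (leave it empty), or the <|MASK|> token (let the model decide). The sketch below illustrates that idea only. The function name `build_instruction`, the "key: value" line layout, the field order, and the choice to treat an unfilled field as <|MASK|> are all assumptions made for illustration; the exact prompt format GIRT-Model expects is not given here and should be taken from the authors' repository.

```python
from typing import Dict, Optional

EMPTY = "<|EMPTY|>"  # the user explicitly wants this field left empty
MASK = "<|MASK|>"    # the model decides what to fill in

# Field names taken from the Figure 1 caption. Their order and the
# "key: value" layout used below are hypothetical, not the paper's format.
METADATA_FIELDS = ("about", "title", "labels", "assignees")


def build_instruction(fields: Dict[str, str], summary: Optional[str] = None) -> str:
    """Assemble an instruction string from optional metadata and a summary.

    Assumption: any field the user does not provide is emitted as <|MASK|>.
    """
    lines = []
    for name in METADATA_FIELDS:
        value = fields.get(name)
        lines.append(f"{name}: {MASK if value is None else value}")
    lines.append(f"summary: {MASK if summary is None else summary}")
    return "\n".join(lines)


# Mirrors the caption's example: assignees is forced empty, while about,
# title, and labels are left for the model to decide.
example = build_instruction(
    {"assignees": EMPTY},
    summary="A bug report template asking for steps to reproduce.",
)
print(example)
```

Keeping the two tokens as named constants makes the distinction between "leave empty" and "let the model decide" explicit at the call site, which is the distinction the caption draws.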