Mining Software Repositories for Expert Recommendation

Chad Marshall, Andrew Barovic, Armin Moin

TL;DR

The paper tackles automated bug triage and assignment in large open-source ecosystems by integrating topic modeling and supervised classification to match bug reports with developers. It advances the field by employing a per-developer BERTopic framework that leverages bug report features (product, component, priority, severity, description, comments) and developer activity to generate Top-$K$ recommendations, and by benchmarking against established baselines (TopicMiner MTM, BT-RL, LDA-SVM, BUGZIE) on Eclipse and Mozilla projects. Experimental results show that the BERTopic-based approach achieves high Top-$1$ and Top-$5$ accuracies, outperforming several baselines on average, and the study explores parameter settings, resource needs, and practical integration considerations. The work provides an open-source prototype and datasets, highlighting practical impact for improving bug triage efficiency and suggesting directions for scaling to more projects and integrating with common issue-tracking platforms.

Abstract

We propose an automated approach to assigning bugs to developers in large open-source software projects. It assists human bug triagers, who are in charge of finding the developer with the right level of expertise in a particular area to assign to a newly reported issue. Our approach is based on the history of software development as documented in issue tracking systems. We deploy BERTopic and techniques from TopicMiner. Our approach uses the features of bug reports, such as the corresponding products and components, as well as their priority and severity levels. We rank developers based on their experience with the specific combinations of these features found in new reports. The evaluation is performed using Top-$K$ accuracy, and the results are compared with the results reported in prior work, namely TopicMiner (MTM), BUGZIE, Bug triaging via deep Reinforcement Learning (BT-RL), and LDA-SVM. The evaluation data come from various Eclipse and Mozilla projects, such as JDT, Firefox, and Thunderbird.
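As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of Top-$K$ accuracy under its commonly used definition: the fraction of bug reports whose actual fixer appears among the first $K$ recommended developers. The paper's exact formulation is not given in this section, so the function name `top_k_accuracy`, the developer names, and the toy data are all hypothetical.

```python
from typing import Sequence


def top_k_accuracy(
    ranked: Sequence[Sequence[str]],  # per bug report: developers, best first
    actual: Sequence[str],            # per bug report: developer who fixed it
    k: int,
) -> float:
    """Fraction of reports whose actual fixer is among the first k recommendations.

    Assumes the common definition of Top-K accuracy; not taken from the paper.
    """
    if len(ranked) != len(actual):
        raise ValueError("ranked and actual must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not actual:
        return 0.0
    hits = sum(1 for recs, dev in zip(ranked, actual) if dev in recs[:k])
    return hits / len(actual)


# Hypothetical toy data: four bug reports, three candidate developers.
recommendations = [
    ["alice", "bob", "carol"],
    ["bob", "carol", "alice"],
    ["carol", "alice", "bob"],
    ["alice", "carol", "bob"],
]
fixers = ["alice", "alice", "bob", "carol"]

for k in (1, 2, 3):
    print(f"Top-{k} accuracy: {top_k_accuracy(recommendations, fixers, k):.2f}")
```

On this toy data, only the first report is a Top-1 hit, the first and fourth are Top-2 hits, and all four are Top-3 hits, so the metric never decreases as $K$ grows.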

Paper Structure

This paper contains 32 sections, 5 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Eclipse bug report #1518
  • Figure 2: BERTopic Diagram
  • Figure 3: Our Construction Phase