Mining Bug Repositories for Multi-Fault Programs

Dylan Callaghan, Bernd Fischer

TL;DR

This work addresses the limitation of single-fault benchmarks by mining real-world projects to produce true multi-fault variants of Defects4J and BugsInPy. It combines test case transplantation with fault location translation to expose and locate multiple coexisting faults without modifying the actual project code. Applied to Defects4J and BugsInPy, the approach yields averages of $9.2$ and $18.6$ faults per version, demonstrating scalability across 311 Defects4J versions and 501 BugsInPy versions. The resulting multi-fault datasets enable more realistic evaluation and training of multi-fault localization and automated debugging techniques while preserving the usability and extensibility of the original datasets.

Abstract

Datasets such as Defects4J and BugsInPy, which contain bugs from real-world software projects, are necessary for a realistic evaluation of automated debugging tools. However, these datasets largely identify only a single bug in each entry, while real-world software projects (including those used in Defects4J and BugsInPy) typically contain multiple bugs at the same time. We lift this limitation and describe an extension to these datasets in which multiple bugs are identified in individual entries. We use test case transplantation to expose the bugs and fault location translation to locate them. We thus provide datasets of true multi-fault versions within real-world software projects, which maintain the properties and usability of the original datasets.
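
The idea of exposing bugs through transplanted tests can be illustrated with a minimal sketch. It assumes that a bug counts as present in a version when at least one of its fault-exposing tests fails after being transplanted into that version. The names `bugs_present`, `average_faults`, and the `test_fails` callback are hypothetical and are not part of Defects4J, BugsInPy, or the authors' tooling; a real pipeline would check out the version, copy the test in, and run the project's test suite.

```python
from typing import Callable, Dict, Iterable, List, Set


def bugs_present(
    versions: Iterable[str],
    exposing_tests: Dict[str, List[str]],
    test_fails: Callable[[str, str], bool],
) -> Dict[str, Set[str]]:
    """Map each version to the bugs whose transplanted tests fail in it.

    test_fails(version, test) is a hypothetical stand-in for transplanting
    the test into the version and running it.
    """
    present: Dict[str, Set[str]] = {}
    for version in versions:
        present[version] = {
            bug
            for bug, tests in exposing_tests.items()
            if any(test_fails(version, test) for test in tests)
        }
    return present


def average_faults(present: Dict[str, Set[str]]) -> float:
    """Average number of identified bugs per version."""
    return sum(len(bugs) for bugs in present.values()) / len(present)


# Toy data (invented for illustration): which test fails in which version.
FAILING = {("v1", "t1"), ("v2", "t1"), ("v2", "t2"), ("v3", "t2")}

toy_present = bugs_present(
    versions=["v1", "v2", "v3"],
    exposing_tests={"bug1": ["t1"], "bug2": ["t2"]},
    test_fails=lambda version, test: (version, test) in FAILING,
)
toy_average = average_faults(toy_present)
```

In this toy setup, version `v2` contains both bugs while `v1` and `v3` contain one each, so a single-fault dataset entry for `v2` would become a multi-fault entry. Locating each bug in the older version (fault location translation) is a separate step that this sketch does not cover.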

Paper Structure (17 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Project layout in the original Defects4J and BugsInPy datasets, and construction of multi-fault variants.
  • Figure 2: Average number of bugs per version, normalized by the program size of the version.
  • Figure 3: Number of tests transplanted per bug, averaged by version.
  • Figure 4: Average number of versions a particular bug is available in (y-axis in log scale).
  • Figure 5: Average number of days between the oldest version a bug is available in and the version in which the bug is fixed (bug lifetime).